Abstract. We consider contractivity for diffusion semigroups w.r.t. Kantorovich
($L^1$ Wasserstein) distances based on appropriately chosen concave functions.
These distances are in between total variation and usual Wasserstein distances.
It is shown that by appropriate explicit choices of the underlying distance,
contractivity with rates of close to optimal order can be obtained in several
fundamental classes of examples where contractivity w.r.t. standard Wasserstein
distances fails. Applications include overdamped Langevin diffusions with
locally non-convex potentials, products of these processes, and systems of weakly
interacting diffusions.



1. Introduction

Consider a diffusion process $(X_t)_{t\ge 0}$ in $\mathbb{R}^d$ defined by a stochastic differential
equation

(1.1)  $dX_t = b(X_t)\,dt + \sigma\,dB_t.$

Here $(B_t)_{t\ge 0}$ is a $d$-dimensional Brownian motion, $\sigma\in\mathbb{R}^{d\times d}$ is a constant $d\times d$
matrix with $\det\sigma > 0$, and $b:\mathbb{R}^d\to\mathbb{R}^d$ is a locally Lipschitz continuous function.
We assume that the unique strong solution of (1.1) is non-explosive, which is
essentially a consequence of the assumptions imposed further below. The transition
kernels of the diffusion process on $\mathbb{R}^d$ defined by (1.1) will be denoted by $p_t(x,dy)$.

Contraction properties of the transition semigroup $(p_t)_{t\ge 0}$ have been studied
by various approaches. In particular, $L^2$ and entropy methods yield bounds that
are both relatively stable under perturbations and applicable in high dimensions,
cf. e.g. [2, 3, 1, 32, 36, 4]. On the other hand, coupling methods provide a more
intuitive probabilistic understanding of convergence to equilibrium [30, 29, 11, 10,
13, 36, 34]. In contrast to $L^2$ and entropy methods, bounds resulting from coupling
methods typically hold for arbitrary initial values $X_0\in\mathbb{R}^d$. In many applications,
couplings are used to bound the total variation distances $d_{TV}(\mu p_t,\nu p_t)$ between
the laws $\mu p_t$ and $\nu p_t$ of $X_t$ w.r.t. two different initial distributions $\mu$ and $\nu$ at a
given time $t\ge 0$, cf. [30, 29]. Typically, however, the total variation distance is
decaying substantially only after a certain amount of time, and the map $\mu\mapsto\mu p_t$
is not a contraction for small times $t$. This is also manifested in cut-off phenomena
[15, 28, 16, 9].

Alternatively, it is well-known that basic couplings (i.e., couplings given by the
flow of the s.d.e. (1.1)) can be used to show that the map $\mu\mapsto\mu p_t$ is exponentially
contractive w.r.t. $L^p$ Wasserstein distances $W^p$ for any $p\in[1,\infty)$ if, for example,
$(X_t)$ is an overdamped Langevin diffusion with a strictly convex potential
$U\in C^2(\mathbb{R}^d)$, i.e., $\sigma = I_d$ and $b = -\nabla U/2$. This leads to an elegant and powerful
approach to convergence to equilibrium and to many related results if applicable.
However, it has been pointed out in [35] that strict convexity of $U$ is also a
necessary condition for exponential contractivity w.r.t. $W^p$. This seems to limit the
applicability substantially.

Here, we are instead considering exponential contractivity w.r.t. Kantorovich
($L^1$ Wasserstein) distances $W_f$ based on underlying distance functions of the form

$d_f(x,y) = f(\|x-y\|)$ on $\mathbb{R}^d$,

and, more generally,

$d_f(x,y) = \sum_{i=1}^n f_i(\|x^i-y^i\|)$ on $\mathbb{R}^{d_1}\times\cdots\times\mathbb{R}^{d_n}$,

where $f, f_i : [0,\infty)\to[0,\infty)$ are strictly increasing concave functions, cf. Sections
2.1 and 3.1 below for details. For proving exponential contractivity, we will apply a
reflection coupling on $\mathbb{R}^d$ and an (approximate) componentwise reflection coupling
on products of Euclidean spaces. It will become clear from the proofs below that
for distances based on concave functions $f, f_i$, these couplings are superior to basic
couplings, whereas the basic couplings are superior w.r.t. the Wasserstein distances
$W^p$ for $p > 1$, cf. e.g. Lemma 6.2.

The idea to study contraction properties w.r.t. Kantorovich distances based
on concave distance functions goes back to Chen and Wang [12, 13], where it is
implicitly contained in the proofs. Indeed, in [13] and [36], Chen and Wang apply
very similar methods to estimate spectral gaps of diffusion generators on $\mathbb{R}^d$ and
on manifolds. Related arguments have also been applied by Hairer and Mattingly
in [20] to quantify exponential ergodicity in infinite dimensional situations. The
key idea in the results below is to "almost" optimize the choice of the functions $f$
and $f_i$ to obtain a large exponential decay rate. In the case $n = 1$, this idea has
been exploited intensively by Chen and Wang in [13] to derive lower bounds for
spectral gaps. The novelty here is that we suggest a simple and very explicit choice
for $f$ that leads to close to optimal results in several examples. The extension to
the product case based on an approximate componentwise reflection coupling then
enables us to obtain dimension free contraction results in product models and
perturbations thereof without relying on convexity.

Example 1.1 (Overdamped Langevin dynamics with locally non-convex
potential). Suppose that $\sigma = I_d$ and $b(x) = -\frac12\nabla U(x)$ for a function $U\in C^2(\mathbb{R}^d)$
that is strictly convex outside a given ball $B\subset\mathbb{R}^d$. Then $Z := \int\exp(-U(x))\,dx$ is
finite, and the probability measure

$d\mu = Z^{-1}\exp(-U)\,dx$

is a stationary distribution for the diffusion process $(X_t)$. Corollary 2.3 below
yields exponential contractivity for the transition semigroup $(p_t)$ with an explicit
rate w.r.t. an appropriate Kantorovich distance $W_f$. As a consequence, we obtain
dimension-independent upper bounds for the standard $L^1$ Wasserstein distances
between the laws $\nu p_t$ of $X_t$ and $\mu$ for arbitrary initial distributions $\nu$ and $t\ge 0$.
These bounds are of optimal order in $R, L\in[0,\infty)$ and $K\in(0,\infty)$ if
$(x-y)\cdot(\nabla U(x)-\nabla U(y))$ is bounded from below by $-L|x-y|^2$ for $|x-y| < R$
and by $K|x-y|^2$ for $|x-y|\ge R$.

Example 1.2 (Product models). For a diffusion process $X_t = (X_t^1,\dots,X_t^n)$ in
$\mathbb{R}^{n\cdot d}$ with independent Langevin diffusions $X^1,\dots,X^n$ as in Example 1.1, Theorem
3.1 below yields exponential contractivity in an appropriate Kantorovich distance
with rate $c = \min(c_1,\dots,c_n)$ where $c_1,\dots,c_n$ are the lower bounds obtained for
the contraction rates of the components.

Example 1.3 (Systems of interacting diffusions). More generally, consider a
system

$dX_t^i = -\frac12\nabla U(X_t^i)\,dt - \frac{\alpha}{n}\sum_{j=1}^n \nabla V(X_t^i - X_t^j)\,dt + dB_t^i$,  $i = 1,\dots,n$,

of $n$ interacting diffusion processes in $\mathbb{R}^d$ where $U\in C^2(\mathbb{R}^d)$ is strictly convex
outside a ball, $V\in C^2(\mathbb{R}^d)$ has bounded second derivatives, and $B^1,\dots,B^n$ are
independent Brownian motions in $\mathbb{R}^d$. Then Corollary 3.4 below shows that for $\alpha$
sufficiently small, exponential contractivity holds in an appropriate Kantorovich
distance with a rate that does not depend on $n$.

Before stating the results in detail, we briefly introduce the couplings to be 
considered in the proofs below: 

A coupling by reflection of two solutions of (1.1) with initial distributions $\mu$ and
$\nu$ is a diffusion process $(X_t,Y_t)$ with values in $\mathbb{R}^{2d}$ defined by $(X_0,Y_0)\sim\eta$ where
$\eta$ is a coupling of $\mu$ and $\nu$,

       $dX_t = b(X_t)\,dt + \sigma\,dB_t$  for $t\ge 0$,

(1.2)  $dY_t = b(Y_t)\,dt + \sigma(I_d - 2e_te_t^T)\,dB_t$  for $t < T$,  $Y_t = X_t$  for $t\ge T$.

Here $e_te_t^T$ is the orthogonal projection onto the unit vector

$e_t := \sigma^{-1}(X_t-Y_t)/|\sigma^{-1}(X_t-Y_t)|$,

and $T = \inf\{t\ge 0 : X_t = Y_t\}$ is the coupling time, i.e., the first hitting time of
the diagonal $\Delta = \{(x,y)\in\mathbb{R}^{2d} : x = y\}$, cf. [30, 11]. The reflection coupling can
be realized as a diffusion process in $\mathbb{R}^{2d}$, and the marginal processes $(X_t)_{t\ge 0}$ and
$(Y_t)_{t\ge 0}$ are solutions of (1.1) w.r.t. the Brownian motions $B_t$ and

$\check B_t = \int_0^t (I_d - 2I_{\{s<T\}}e_se_s^T)\,dB_s.$

Notice that by Lévy's characterization, $\check B$ is indeed a Brownian motion since the
process $I_d - 2I_{\{s<T\}}e_se_s^T$ takes values in the orthogonal matrices. The difference
vector

$Z_t := X_t - Y_t$

solves the s.d.e.

(1.3)  $dZ_t = (b(X_t)-b(Y_t))\,dt + 2|\sigma^{-1}Z_t|^{-1}Z_t\,dW_t$  for $t < T$,  $Z_t = 0$  for $t\ge T$,

w.r.t. the one-dimensional Brownian motion

$W_t = \int_0^t e_s^T\,dB_s.$

A basic coupling of two solutions of (1.1) is defined correspondingly with $e_t\equiv 0$,
i.e., the same noise is applied both to $X_t$ and $Y_t$. Below we will also consider mixed
couplings that are reflection couplings for certain values of $Z_t$, basic couplings for
other values of $Z_t$, and mixtures of both types of couplings for $Z_t$ in an intermediate
region. Notice that the standard reflection coupling introduced above is a basic
coupling for $t\ge T$, i.e., if $Z_t = 0$.
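The coupling (1.2) can be illustrated by a time discretization. The following is a minimal sketch only, assuming $\sigma = I_d$ and an Euler-Maruyama scheme; the names `reflect` and `coupled_step`, the step size `h` and the tolerance `tol` are hypothetical choices that do not appear in the paper, and the discrete scheme only approximates the continuous-time coupling (in particular, the exact coupling time $T$ is replaced by a threshold on $|X_t - Y_t|$).

```python
import math
import random

def reflect(e, v):
    """Apply the reflection I - 2 e e^T to the vector v (e must be a unit vector)."""
    dot = sum(ei * vi for ei, vi in zip(e, v))
    return [vi - 2.0 * dot * ei for ei, vi in zip(e, v)]

def coupled_step(x, y, b, h, rng, tol=1e-12):
    """One Euler-Maruyama step of the reflection coupling (1.2), assuming sigma = I_d.

    x, y : current states (lists of floats); b : drift function; h : step size.
    If |x - y| <= tol, both copies are driven by the same noise (basic coupling).
    """
    d = len(x)
    dB = [rng.gauss(0.0, math.sqrt(h)) for _ in range(d)]
    z = [xi - yi for xi, yi in zip(x, y)]
    norm = math.sqrt(sum(zi * zi for zi in z))
    if norm <= tol:
        dB_y = dB                       # already coupled: same noise for both copies
    else:
        e = [zi / norm for zi in z]     # unit vector e_t along X_t - Y_t
        dB_y = reflect(e, dB)           # noise reflected at the hyperplane orthogonal to e_t
    bx, by = b(x), b(y)
    x_new = [xi + bxi * h + dbi for xi, bxi, dbi in zip(x, bx, dB)]
    y_new = [yi + byi * h + dbi for yi, byi, dbi in zip(y, by, dB_y)]
    return x_new, y_new
```

With vanishing drift, one step changes the difference $X - Y$ only along the direction $e_t$, which is the discrete counterpart of the noise term in (1.3).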

More generally, we will consider couplings for diffusion processes on product
spaces (such as in Examples 1.2 and 1.3) that are approximately componentwise
reflection couplings, i.e., the $i$-th component $(X_t^i, Y_t^i)$ of the coupling $(X_t,Y_t)$ is
defined similarly to (1.2) provided $|X_t^i - Y_t^i|\ge\delta$ for a given constant $\delta > 0$, cf.
Section 6 below.

For diffusion processes with non-constant diffusion matrix $\sigma(x)$, the reflection
coupling should be replaced by the Kendall-Cranston coupling w.r.t. the intrinsic
Riemannian metric $G(x) = (\sigma(x)\sigma(x)^T)^{-1}$ induced by the diffusion coefficients, cf.
[26, 14, 23, 36]. Here, we restrict ourselves to the case of constant diffusion matrices
where the Kendall-Cranston coupling coincides with the standard coupling by
reflection.

In the next two sections we state the main results of this paper. A part of the 
results in Section 2 have been announced in the Comptes Rendus Note [17]. 



2. Main results for reflection coupling

2.1. Reflection couplings and contractivity on $\mathbb{R}^d$. Lindvall and Rogers [30]
introduced coupling by reflection in order to derive upper bounds for the total
variation distance of the distributions of $X_t$ and $Y_t$ at a given time $t\ge 0$. Here we
are instead considering the Kantorovich-Rubinstein ($L^1$-Wasserstein) distances

(2.1)  $W_f(\mu,\nu) = \inf_\eta\int d_f(x,y)\,\eta(dx\,dy)$,  $d_f(x,y) = f(\|x-y\|)$  $(x,y\in\mathbb{R}^d)$,

of probability measures $\mu,\nu$ on $\mathbb{R}^d$, where the infimum is over all couplings $\eta$ of $\mu$
and $\nu$, $f : [0,\infty)\to[0,\infty)$ is an appropriately chosen concave increasing function
with $f(0) = 0$, and $\|z\| = \sqrt{z\cdot Gz}$ with $G\in\mathbb{R}^{d\times d}$ symmetric and strictly positive
definite. Typical choices for the norm are the Euclidean norm $\|z\| = |z|$ and
the intrinsic metric $\|z\| = |\sigma^{-1}z|$ corresponding to $G = I_d$ and $G = (\sigma\sigma^T)^{-1}$
respectively.

Remark 2.1 (Interpolating between total variation and Wasserstein
distances). For the choice of the function $f$ there are two extreme cases with minimal
and maximal concavity:

(1) Choosing $f(x) = x$ yields the standard Kantorovich ($L^1$ Wasserstein)
distance $W_f = W^1$. In this case it is well known that if, for example,
$G = \sigma = I_d$ and $b(x) = -\nabla U(x)/2$, then the transition kernels $p_t(x,dy)$ of
the diffusion process $(X_t)$ satisfy

$W_f(\mu p_t,\nu p_t)\le e^{-Kt/2}\,W_f(\mu,\nu)$  for any $\mu,\nu$ and $t\ge 0$,

provided $\nabla^2U\ge K\cdot I_d$ holds globally. This condition is also sharp in the
sense that if $U$ is not globally strictly convex, then contractivity of $p_t$ w.r.t.
$W_f$ does not hold, cf. Sturm and von Renesse [35].

(2) On the other hand, choosing $f(x) = I_{(0,\infty)}(x)$ yields the total variation
distance $W_f = d_{TV}$. In this case,

$W_f(\mu p_t,\nu p_t)\le P[T > t]$  for any $\mu,\nu$ and $t\ge 0$,

but there is no contractivity of $p_t$ w.r.t. $d_{TV}$ in general. Indeed, in many
applications $d_{TV}(\mu p_t,\nu p_t)$ only decreases substantially after a certain amount
of time ("cut-off phenomenon").

Following Chen and Wang [13], we will argue that by choosing for $f$ an appropriate
concave function, exponential contractivity w.r.t. $W_f$ holds even without global
convexity. We now describe how the function $f$ can be chosen in a very explicit
way such that the obtained exponential decay rate w.r.t. the Kantorovich distance
$W_f$ differs from the maximal decay rate that we can achieve by our approach based
on reflection coupling only by a constant factor.

At first, similarly to Lindvall and Rogers [30], let us define for $r\in(0,\infty)$:

$\kappa(r) = \inf\Big\{-2\,\frac{|\sigma^{-1}(x-y)|^2}{\|x-y\|^2}\,\frac{(x-y)\cdot G(b(x)-b(y))}{\|x-y\|^2} : x,y\in\mathbb{R}^d \text{ s.t. } \|x-y\| = r\Big\},$

i.e., $\kappa(r)$ is the largest constant such that

(2.2)  $(x-y)\cdot G(b(x)-b(y)) \le -\frac12\,\kappa(r)\,\|x-y\|^4/|\sigma^{-1}(x-y)|^2$

holds for any $x,y\in\mathbb{R}^d$ with $\|x-y\| = r$. Notice that if $\|\cdot\|$ is the intrinsic metric
then the factor $|\sigma^{-1}(x-y)|^2/\|x-y\|^2$ equals 1. In Example 1.1 with $G = I_d$, we
have

$\kappa(r) = \inf\Big\{\int_0^1 \partial^2_{(x-y)/|x-y|}U((1-t)x+ty)\,dt : x,y\in\mathbb{R}^d \text{ s.t. } |x-y| = r\Big\}.$

We assume from now on that $\kappa(r)$ is a continuous function on $(0,\infty)$ satisfying

(2.3)  $\liminf_{r\to\infty}\kappa(r) > 0$  and  $\int_0^1 r\,\kappa(r)^-\,dr < \infty$.

In Example 1.1 with $G = I_d$, this assumption is satisfied if $U$ is strictly convex
outside a ball.
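The formula for $\kappa$ in Example 1.1 can be checked numerically in dimension one. The sketch below is an illustration only: the double-well potential $U(x) = x^4/4 - x^2/2$, the function name `kappa_numeric`, and the grid parameters are assumptions that are not taken from the paper. For this potential, $\int_0^1 U''(y+sr)\,ds = 3y^2 + 3yr + r^2 - 1$, which is minimized at $y = -r/2$, so that $\kappa(r) = r^2/4 - 1$ and $\kappa(r)\ge 0$ exactly for $r\ge 2$.

```python
def kappa_numeric(r, d2U, y_min=-5.0, y_max=5.0, n_y=1001, n_t=100):
    """Grid approximation of kappa(r) in dimension one with G = Id and b = -U'/2.

    kappa(r) = inf over |x - y| = r of  int_0^1 U''((1 - t) x + t y) dt,
    approximated by a midpoint rule in t and a grid search over y (with x = y + r).
    """
    best = float("inf")
    for k in range(n_y):
        y = y_min + (y_max - y_min) * k / (n_y - 1)
        x = y + r
        # midpoint rule for the averaged second derivative along the segment [y, x]
        avg = sum(d2U((1 - (j + 0.5) / n_t) * x + (j + 0.5) / n_t * y)
                  for j in range(n_t)) / n_t
        best = min(best, avg)
    return best

# Hypothetical double-well potential U(x) = x^4/4 - x^2/2, so U''(x) = 3 x^2 - 1.
double_well_d2U = lambda x: 3.0 * x * x - 1.0
```

For this example the grid search reproduces the closed form $r^2/4 - 1$ up to discretization error; the search window `[y_min, y_max]` has to contain the minimizer, which is an assumption the caller must check for other potentials.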
Next, we define constants $R_0, R_1\in[0,\infty)$ with $R_0\le R_1$ by

(2.4)  $R_0 = \inf\{R\ge 0 : \kappa(r)\ge 0\ \ \forall r\ge R\}$,

(2.5)  $R_1 = \inf\{R\ge R_0 : \kappa(r)R(R-R_0)\ge 8\ \ \forall r\ge R\}$.

Notice that by (2.3), both constants are finite. We now consider the particular
distance function $d_f(x,y) = f(\|x-y\|)$ given by

(2.6)  $f(r) = \int_0^r \varphi(s)g(s)\,ds$, where

       $\varphi(r) = \exp\Big(-\frac14\int_0^r s\,\kappa(s)^-\,ds\Big)$,  $\Phi(r) = \int_0^r\varphi(s)\,ds$,

       $g(r) = 1 - \frac12\int_0^{r\wedge R_1}\Phi(s)\varphi(s)^{-1}\,ds\Big/\int_0^{R_1}\Phi(s)\varphi(s)^{-1}\,ds$.

Let us summarize some basic properties of the functions $\varphi$, $g$ and $f$:

• $\varphi$ is decreasing, $\varphi(0) = 1$, and $\varphi(r) = \varphi(R_0)$ for any $r\ge R_0$,

• $g$ is decreasing, $g(0) = 1$, and $g(r) = \frac12$ for any $r\ge R_1$,

• $f$ is concave, $f(0) = 0$, $f'(0) = 1$, and

(2.7)  $\Phi(r)/2\le f(r)\le\Phi(r)$  for any $r\ge 0$.

The last statement shows that $d_f$ and $d_\Phi$ as well as $W_f$ and $W_\Phi$ differ at most by
a factor 2.
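The construction (2.4)-(2.6) can be tabulated numerically. The following sketch assumes a piecewise constant $\kappa$ with $\kappa(r) = -L$ for $r\le R$ and $\kappa(r) = K$ for $r > R$ (with $L, K, R > 0$), so that $R_0 = R$, $R_1$ solves $KR_1(R_1 - R_0) = 8$, and $\varphi(r) = \exp(-L\min(r,R_0)^2/8)$. The function name `build_f`, the uniform grid and the trapezoid rule are illustrative choices, not part of the paper.

```python
import math

def build_f(L, K, R, n=4000):
    """Tabulate phi, Phi, g, f from (2.6) on a uniform grid over [0, R1].

    Assumes kappa(r) = -L for r <= R and kappa(r) = K for r > R with L, K, R > 0,
    so that R0 = R and R1 = (R0 + sqrt(R0^2 + 32/K)) / 2.
    """
    R0 = R
    R1 = 0.5 * (R0 + math.sqrt(R0 * R0 + 32.0 / K))
    h = R1 / n
    grid = [i * h for i in range(n + 1)]
    # phi(r) = exp(-(1/4) int_0^r s kappa(s)^- ds) = exp(-L min(r, R0)^2 / 8)
    phi = [math.exp(-L * min(r, R0) ** 2 / 8.0) for r in grid]

    def cumtrapz(vals):
        """Cumulative trapezoid rule on the uniform grid."""
        out = [0.0]
        for i in range(1, len(vals)):
            out.append(out[-1] + 0.5 * h * (vals[i - 1] + vals[i]))
        return out

    Phi = cumtrapz(phi)
    ratio_int = cumtrapz([P / p for P, p in zip(Phi, phi)])  # int_0^r Phi/phi ds
    total = ratio_int[-1]                                     # int_0^{R1} Phi/phi ds
    g = [1.0 - 0.5 * I / total for I in ratio_int]
    f = cumtrapz([p * gi for p, gi in zip(phi, g)])
    return {"grid": grid, "phi": phi, "Phi": Phi, "g": g, "f": f,
            "R0": R0, "R1": R1, "integral": total}
```

On the grid, the listed properties ($\varphi(0) = g(0) = 1$, $g(R_1) = 1/2$, and $\Phi/2\le f\le\Phi$) can be verified directly; beyond $R_1$ the function $f$ is affine with slope $\varphi(R_0)/2$.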

We will explain in Section 4 below how the choice of $f$ is obtained by trying to
maximize the exponential decay rate. Let us now state our first main result which
will also be proven in Section 4.

Theorem 2.2 (Exponential contractivity of reflection coupling). Let
$\alpha := \sup\{|\sigma^{-1}z|^2 : z\in\mathbb{R}^d \text{ with } \|z\| = 1\}$, and define $c\in(0,\infty)$ by

(2.8)  $\frac1c = \alpha\int_0^{R_1}\Phi(s)\varphi(s)^{-1}\,ds = \alpha\int_0^{R_1}\int_0^s\exp\Big(\frac14\int_t^s u\,\kappa(u)^-\,du\Big)\,dt\,ds.$

Then for the distance $d_f$ given by (2.1) and (2.6), the function $t\mapsto e^{ct}\,E[d_f(X_t,Y_t)]$
is decreasing on $[0,\infty)$.

The theorem yields exponential contractivity at rate $c > 0$ for the transition
kernels $p_t$ of (1.1) w.r.t. the Kantorovich distance $W_f$. Moreover, it implies upper
bounds for the standard Kantorovich ($L^1$ Wasserstein) distance $W^1 = W_{\mathrm{id}}$ w.r.t.
the distance function $d(x,y) = \|x-y\|$:

Corollary 2.3. For any $t\ge 0$ and any probability measures $\mu,\nu$ on $\mathbb{R}^d$,

(2.9)   $W_f(\mu p_t,\nu p_t)\le\exp(-ct)\,W_f(\mu,\nu)$, and

(2.10)  $W^1(\mu p_t,\nu p_t)\le 2\varphi(R_0)^{-1}\exp(-ct)\,W^1(\mu,\nu)$.
Note that the second estimate follows from the first, since by the properties of
$\varphi$ and $g$ stated above, $\varphi(R_0)/2\le f'\le 1$, and hence

(2.11)  $\varphi(R_0)\|x-y\|/2\le d_f(x,y)\le\|x-y\|$  for any $x,y\in\mathbb{R}^d$.

The corollary yields an upper bound for mixing times w.r.t. the Kantorovich
distance $W^1$. For $\varepsilon > 0$ let

$\tau_{W^1}(\varepsilon) := \inf\{t\ge 0 : W^1(\mu p_t,\nu p_t)\le\varepsilon\,W^1(\mu,\nu)\ \ \forall\mu,\nu\in M_1(\mathbb{R}^d)\}.$

Then by Corollary 2.3,

$\tau_{W^1}(\varepsilon)\le c^{-1}\log(2/(\varepsilon\varphi(R_0)))$  for any $\varepsilon > 0$.

The proofs of Theorem 2.2 and Corollary 2.3 are given in Section 4 below.
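The mixing time bound is obtained by solving $2\varphi(R_0)^{-1}\exp(-ct) = \varepsilon$ for $t$ in (2.10). As a small illustration, the helper below evaluates this bound; the function name `mixing_time_bound` and its argument names are hypothetical, and the values of $c$ and $\varphi(R_0)$ have to be supplied by the caller.

```python
import math

def mixing_time_bound(c, phi_R0, eps):
    """Upper bound c^{-1} log(2 / (eps * phi(R0))) for the W^1 mixing time.

    c      : contraction rate from (2.8), c > 0
    phi_R0 : the value phi(R0), which lies in (0, 1]
    eps    : target contraction factor, eps > 0
    """
    if not (c > 0 and 0 < phi_R0 <= 1 and eps > 0):
        raise ValueError("need c > 0, 0 < phi(R0) <= 1 and eps > 0")
    # a mixing time is nonnegative, so the bound is truncated at 0
    return max(0.0, math.log(2.0 / (eps * phi_R0)) / c)
```

At the returned time $t$, the prefactor $2\varphi(R_0)^{-1}\exp(-ct)$ in (2.10) equals $\varepsilon$ whenever $\varepsilon\le 2/\varphi(R_0)$.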

Remark 2.4 (Non-constant diffusion coefficients). The methods and results
presented above have natural extensions to diffusion processes with non-constant
diffusion matrices. In that case, one possibility is to use an ad hoc coupling
as in [30], but this leads to restrictive assumptions and bounds that are far from
optimal. A better approach is to switch to a Riemannian setup where the metric
is the intrinsic metric $G(x) = (\sigma(x)\sigma(x)^T)^{-1}$ given by the diffusion coefficients.
Then by replacing the reflection coupling by the corresponding Kendall-Cranston
coupling, one should expect similar results as above.

2.2. Consequences. We summarize some important consequences of exponential 
contractivity w.r.t. Kantorovich distances as stated in Corollary 2.3. These conse- 
quences are essentially well-known, cf. e.g. Joulin and Ollivier [25], Joulin [24] and 
Komorowski and Walczuk [27] for related results in discrete and continuous time 
respectively. For the reader's convenience, the proofs are nevertheless included in 
Section 4 below. 

We assume from now on that $\|z\| = |\sigma^{-1}z|$ is the intrinsic metric, $b$ is in
$C^1(\mathbb{R}^d,\mathbb{R}^d)$, and

(2.12)  $\int |z|\,p_t(x_0,dz) < \infty$

holds for some $x_0\in\mathbb{R}^d$ and any $t\ge 0$. Then, equivalently to (2.9), Theorem 2.2
implies Lipschitz contractivity for the transition semigroup

$(p_tg)(x) = \int g(z)\,p_t(x,dz)$

w.r.t. the metric $d_f$, i.e.,

(2.13)  $\|p_tg\|_{\mathrm{Lip}(f)}\le\exp(-ct)\,\|g\|_{\mathrm{Lip}(f)}$

holds for any $t\ge 0$ and any Lipschitz continuous function $g : \mathbb{R}^d\to\mathbb{R}$, where

$\|g\|_{\mathrm{Lip}(f)} = \sup\Big\{\frac{|g(x)-g(y)|}{d_f(x,y)} : x,y\in\mathbb{R}^d \text{ s.t. } x\ne y\Big\}$

denotes the Lipschitz semi-norm w.r.t. $d_f$. An immediate consequence is the
existence of a unique stationary distribution $\mu$ with finite second moments:
Corollary 2.5 (Convergence to equilibrium). There exists a unique stationary
distribution $\mu$ of $(p_t)_{t\ge 0}$ satisfying $\int |y|\,\mu(dy) < \infty$, and

(2.14)  $\mathrm{Var}_\mu(g)\le (2c)^{-1}\,\|g\|^2_{\mathrm{Lip}(f)}$  for any Lipschitz continuous $g : \mathbb{R}^d\to\mathbb{R}$.

Moreover, for any probability measure $\nu$ on $\mathbb{R}^d$,

(2.15)  $W_f(\mu,\nu p_t)\le\exp(-ct)\,W_f(\mu,\nu)$  for any $t\ge 0$.

We refer to [5, 8] for other recent results on convergence to equilibrium of
diffusion processes in Wasserstein distances.

Further important consequences of (2.13) are quantitative non-asymptotic bounds
for the decay of correlations and the bias and variance of ergodic averages. Let
$x_0\in\mathbb{R}^d$ and suppose that $(X,P)$ is a solution of (1.1) with initial condition $X_0 = x_0$
a.s.

Corollary 2.6 (Decay of correlations). For any Lipschitz continuous functions
$g,h : \mathbb{R}^d\to\mathbb{R}$ and $s,t\ge 0$,

$\mathrm{Cov}(g(X_t),h(X_{t+s}))\le\frac{1}{2c}\,e^{-cs}\,\|g\|_{\mathrm{Lip}(f)}\,\|h\|_{\mathrm{Lip}(f)}.$

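Corollary 2.7 (Bias and variance of ergodic averages). For any Lipschitz
continuous function $g : \mathbb{R}^d\to\mathbb{R}$ and $t\in(0,\infty)$,

$\Big|E\Big[\frac1t\int_0^t g(X_s)\,ds\Big] - \int g\,d\mu\Big|\le\frac{1-e^{-ct}}{ct}\,\|g\|_{\mathrm{Lip}(f)}\int d_f(x_0,y)\,\mu(dy)$, and

$\mathrm{Var}\Big[\frac1t\int_0^t g(X_s)\,ds\Big]\le\frac{1}{c^2t}\,\|g\|^2_{\mathrm{Lip}(f)}.$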

In the variance estimate in Corollary 2.7, one of the factors 1/c is due to the 
variance bound (2.14) w.r.t. the stationary distribution whereas the second factor 
1/c bounds the decay rate for the correlations. Short proofs of Corollaries 2.5, 2.6, 
and 2.7 are included in Section 4. 

Remark 2.8 (Central Limit Theorem, Gaussian deviation inequality). 

The contractivity w.r.t. Wf can also be used to prove a central limit theorem 
for the ergodic averages [27] and a Gaussian deviation inequality strengthening 
Corollary 2.7, cf. Remark 2.10 in [24]. 










2.3. First examples. In order to illustrate the bounds given in Theorem 2.2 and
in Corollary 2.3, we estimate the constant $c$ defined by (2.8) in different scenarios,
and we study the behaviour of $c$ under perturbations of the drift $b$.

We first consider the situation where $\kappa$ is bounded from below by a negative
constant for any $r$, and by a positive constant for large $r$:
Lemma 2.9 (Contractivity under lower bounds on $\kappa$). Suppose that

(2.16)  $\kappa(r)\ge -L$ for $r\le R$, and $\kappa(r)\ge K$ for $r > R$

hold with constants $R, L\in[0,\infty)$ and $K\in(0,\infty)$. If $LR_0^2\le 8$ then

(2.17)  $\alpha^{-1}c^{-1}\le\frac{e-1}{2}R^2 + e\sqrt{8K^{-1}}\,R + 4K^{-1}\le\frac{3e}{2}\max(R^2, 8K^{-1})$,

and if $LR_0^2\ge 8$ then

(2.18)  $\alpha^{-1}c^{-1}\le 8\sqrt{2\pi}\,R^{-1}L^{-1/2}(L^{-1}+K^{-1})\exp(LR^2/8) + 32R^{-2}K^{-2}$.

Remark 2.10 (Convex case). If $L = 0$ then the bound in (2.17) improves to

(2.19)  $\alpha^{-1}c^{-1}\le 2\max(R^2, 2K^{-1})$.

The proofs of Lemma 2.9 and Remark 2.10 are given in Section 5 below.
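The bound (2.17) can be compared with a direct numerical evaluation of (2.8). The sketch below makes the following assumptions, none of which come from the paper: $\kappa$ attains the lower bounds in (2.16) exactly, i.e. $\kappa(r) = -L$ for $r\le R$ and $\kappa(r) = K$ for $r > R$ with $L, K > 0$ (so that $R_0 = R$ and $R_1 = (R_0 + \sqrt{R_0^2 + 32/K})/2$), and the integrals are approximated by the trapezoid rule. The names `inv_rate` and `bound_2_17` are hypothetical.

```python
import math

def inv_rate(L, K, R, n=20000):
    """Approximate 1/(alpha*c) from (2.8) for kappa = -L on (0, R], K on (R, inf).

    Assumes L > 0 and K > 0, so that R0 = R and R1 = (R0 + sqrt(R0^2 + 32/K)) / 2.
    Uses phi(r) = exp(-L min(r, R0)^2 / 8) and integrates Phi/phi over [0, R1].
    """
    R0 = R
    R1 = 0.5 * (R0 + math.sqrt(R0 * R0 + 32.0 / K))
    h = R1 / n
    phi_prev, Phi, ratio_prev, total = 1.0, 0.0, 0.0, 0.0
    for i in range(1, n + 1):
        r = i * h
        phi = math.exp(-L * min(r, R0) ** 2 / 8.0)
        Phi += 0.5 * h * (phi_prev + phi)          # trapezoid rule for Phi(r)
        ratio = Phi / phi
        total += 0.5 * h * (ratio_prev + ratio)    # trapezoid rule for int Phi/phi
        phi_prev, ratio_prev = phi, ratio
    return total

def bound_2_17(K, R):
    """The two right-hand sides of (2.17), valid when L * R0^2 <= 8."""
    first = 0.5 * (math.e - 1.0) * R * R + math.e * math.sqrt(8.0 / K) * R + 4.0 / K
    second = 1.5 * math.e * max(R * R, 8.0 / K)
    return first, second
```

For instance, with $L = 1/2$, $K = 1$ and $R = 2$ (so $LR_0^2 = 2\le 8$), the numerically computed value of $\alpha^{-1}c^{-1}$ lies below both expressions in (2.17), as the lemma asserts.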

In the first case considered in the lemma, the constant $c$ is at least of order
$\min(R^{-2}, K)$. Even if $L = 0$ (convex case), this order cannot be improved, as
one-dimensional Langevin diffusions with potential $U(x) = Kx^2/2$, or, respectively,
with vanishing drift on $(-R/2, R/2)$ demonstrate. In particular, for $U(x) = Kx^2/2$
with $K > 0$, the distance $W_f$ is equivalent to $W^1$, and the exact decay rate is
$K/2$. This differs from the bounds in (2.19) and (2.17) only by a factor 2 and $6e$,
respectively. Thus if $LR_0^2$ is not too large, the contractivity properties are not
affected substantially by non-convexity.

In the second case ($LR_0^2\ge 8$), if $K\ge\mathrm{const.}\cdot L$ then the upper bound for $c^{-1}$ is
of order $L^{-3/2}R^{-1}\exp(LR^2/8)$. This order in $R$ and $L$ is again optimal:

Example 2.11 (Double-well potential with $U''(x) = -L$ for $|x|\le R/2$).
Consider a Langevin diffusion in $\mathbb{R}^1$ with a symmetric potential $U\in C^2(\mathbb{R})$
satisfying $U(x) = -Lx^2/2$ for $x\in[-R/2,R/2]$, $U''\ge -L$, and $\liminf_{|x|\to\infty}U''(x) > 0$.
If $\|\cdot\|$ is the Euclidean norm then $\kappa(r) = -L$ for $r\in(0,R]$. On the other hand,
let $\tau_0 = \inf\{t\ge 0 : X_t = 0\}$ denote the first hitting time of 0. Then for any initial
condition $x_0 > 0$,

(2.20)  $\lim_{t\to\infty} t^{-1}\log P_{x_0}[\tau_0 > t] = -\lambda_1(0,\infty)$

where $-\lambda_1(0,\infty)$ is the first Dirichlet eigenvalue of the generator $\mathcal{L}v = (v''-U'v')/2$
on $(0,\infty)$, cf. [19] or see Section 5 below for a short proof of the corresponding
lower bound that is relevant here. If $LR^2\ge 4$ then by inserting the function $g(x) =
\min(\sqrt{L}\,x, 1)$ into the variational characterization of the Dirichlet eigenvalue, we
obtain the upper bound

(2.21)  $\lambda_1(0,\infty)\le\frac34\,e^{1/2}L^{3/2}R\exp(-LR^2/8)$,

cf. Section 5 below. The estimates (2.20) and (2.21) seem to indicate that for
$x_0 > 0$, the Kantorovich distance $W^1(\delta_{-x_0}p_t,\delta_{x_0}p_t)$ decays at most with a rate
of order $L^{3/2}R\exp(-LR^2/8)$. Indeed, under appropriate growth assumptions on
$U(x)$ for $|x|\ge R$, one can prove that

$P_R[\tau_0 > t]\ge 3/4$  for any $t\le\lambda_1(0,\infty)^{-1}/4$;

cf. Section 5. Hence for $t\le 3^{-1}e^{-1/2}L^{-3/2}R^{-1}\exp(LR^2/8)$, the Kantorovich
distance $W^1(\delta_Rp_t,\mu)$ between $\delta_Rp_t$ and the stationary distribution $\mu$ is bounded from
below by a strictly positive constant that does not depend on $L$ and $R$ if $LR^2\ge 4$.

For analysing the behaviour of $c$ under perturbations of the drift, we assume
that $\|z\| = |\sigma^{-1}z|$ is the intrinsic metric corresponding to the diffusion matrix,
i.e., $G = (\sigma\sigma^T)^{-1}$. Suppose that

(2.22)  $b(x) = b_0(x) + \gamma(x)$  for any $x\in\mathbb{R}^d$

with locally Lipschitz continuous functions $b_0,\gamma : \mathbb{R}^d\to\mathbb{R}^d$. For $r > 0$ let

(2.23)  $\kappa_0(r) = \inf\Big\{-2\,\frac{(x-y)\cdot G(b_0(x)-b_0(y))}{\|x-y\|^2} : x,y\in\mathbb{R}^d \text{ s.t. } \|x-y\| = r\Big\}$

be defined analogously to $\kappa(r)$ with $b$ replaced by $b_0$. We assume that $\kappa_0$ satisfies
the assumptions (2.3) imposed on $\kappa$ above, and we define $R_0$ and $R_1$ similarly
to (2.4) and (2.5) but with $\kappa$ replaced by $\kappa_0$. Now suppose that there exists a
constant $R\le R_0$ such that

(2.24)  $(x-y)\cdot(\gamma(x)-\gamma(y))\le 0$  for any $x,y\in\mathbb{R}^d$ s.t. $\|x-y\|\ge R$.

Then $\kappa(r)\ge\kappa_0(r)$ for $r\ge R$, and hence the constants $R_0(b)$ and $R_1(b)$ defined w.r.t. $b$
are smaller than the corresponding constants defined w.r.t. $b_0$. In this situation,
we can compare the lower bounds $c$ and $c_0$ for the contraction rates w.r.t. $b$ and
$b_0$ given by (2.8):

Lemma 2.12 (Bounded and Lipschitz perturbations). Suppose that the drift
$b : \mathbb{R}^d\to\mathbb{R}^d$ is given by (2.22) with $b_0$ and $\gamma$ satisfying the assumptions stated
above, and let $c$ and $c_0$ denote the lower bounds for the contraction rates w.r.t. $b$
and $b_0$ given by (2.8).

(1) If $\gamma$ is bounded and (2.24) holds for a constant $R\in[0,R_0]$ then

(2.25)  $c\ge c_0\exp(-R\sup\|\gamma\|)$.

(2) If $\gamma$ satisfies the one-sided Lipschitz condition

(2.26)  $(x-y)\cdot G(\gamma(x)-\gamma(y))\le L\cdot\|x-y\|^2$  $\forall x,y\in\mathbb{R}^d$

with a finite constant $L\in[0,\infty)$ and (2.24) holds for a constant $R\in[0,R_0]$
then

(2.27)  $c\ge c_0\exp(-LR^2/4)$.

Remark 2.13. The condition $R\le R_0$ is required in Lemma 2.12. If (2.24) does
not hold for $x,y\in\mathbb{R}^d$ with $\|x-y\|\ge R_0$ then the constants $R_0(b)$ and $R_1(b)$
defined w.r.t. $b$ are in general greater than the corresponding constants defined
w.r.t. $b_0$, i.e., the region of non-convexity increases by adding the drift $\gamma$. This will
also affect the bound in (2.8) significantly.
The proof of Lemma 2.12 is given in Section 5.

2.4. Local contractivity and a high-dimensional example. Consider again
the setup in Section 2.1. In some applications, the condition $\liminf_{r\to\infty}\kappa(r) > 0$
imposed above is not satisfied, but the diffusion process will stay inside a ball
$B\subset\mathbb{R}^d$ for a long time with high probability. In this case, one can still prove exponential
contractivity up to an error term that is determined by the exit probabilities from
the ball. Corresponding estimates are useful to prove non-asymptotic error bounds,
i.e., for fixed $t\in(0,\infty)$, cf. e.g. [7, 6, 18].

Fix $R\in(0,\infty)$ and let $W_{f_R}$ denote the Kantorovich distance based on the
distance function $d_{f_R}(x,y) = f_R(\|x-y\|)$ given by

(2.28)  $f_R(r) = \int_0^r\varphi(s)g_R(s)\,ds$  for $r\ge 0$,



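where $\varphi$ and $\Phi$ are defined by (2.6), and

(2.29)  $g_R(r) = 1 - \int_0^{r\wedge R}\Phi(s)\varphi(s)^{-1}\,ds\Big/\int_0^R\Phi(s)\varphi(s)^{-1}\,ds$.
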

Notice that

$g_R(r) = 0$ and $f_R(r) = f_R(R)$ for any $r\ge R$,

i.e., we have cut the distance at $f_R(R)$.

Theorem 2.14 (Local exponential contractivity). Suppose that the assumptions
from Section 2.1 are satisfied except for the condition $\liminf_{r\to\infty}\kappa(r) > 0$.
Then for any $t, R\ge 0$ and any probability measures $\mu,\nu$ on $\mathbb{R}^d$,

(2.30)  $W_{f_R}(\mu p_t,\nu p_t)\le\exp(-c_Rt)\,W_{f_R}(\mu,\nu) + R\cdot\big(P_\mu[\tau_{R/2}\le t] + P_\nu[\tau_{R/2}\le t]\big)$,

where $(X_t,P_\mu)$ is a diffusion process satisfying (1.1) with initial distribution $\mu$,
$\tau_{R/2} = \inf\{t\ge 0 : \|X_t\| > R/2\}$ denotes the first exit time from the ball of radius
$R/2$ around 0, and

(2.31)  $\frac{1}{c_R} = \alpha\int_0^R\Phi(s)\varphi(s)^{-1}\,ds = \alpha\int_0^R\int_0^s\exp\Big(\frac14\int_t^s u\,\kappa(u)^-\,du\Big)\,dt\,ds.$

The proof of the theorem is given in Section 5. In applications, the exit
probabilities are typically estimated by using appropriate Lyapunov functions.

Example 2.15 (Stochastic heat equation). We consider the diffusion in $\mathbb{R}^{d-1}$
given by $X_t^0\equiv X_t^d\equiv 0$ and

(2.32)  $dX_t^i = \big[d^2(X_t^{i+1} - 2X_t^i + X_t^{i-1}) + V'(X_t^i)\big]\,dt + \sqrt{d}\,dB_t^i$,

$i = 1,\dots,d-1$, where $V : \mathbb{R}\to\mathbb{R}$ is a $C^2$ function such that $V''\ge -L$ for a finite
constant $L\in\mathbb{R}$. The equation (2.32) is a spatial discretization at the grid points
$i/d$ ($i = 0,1,\dots,d$) of the stochastic heat equation with space-time white noise
and Dirichlet boundary conditions on the interval $[0,1]$ given by

(2.33)  $du = (\Delta_{\mathrm{Dir}}u + V'(u))\,dt + dW$

with the Dirichlet Laplacian $\Delta_{\mathrm{Dir}}$ on the interval $[0,1]$ and a cylindrical Wiener
process $(W_t)_{t\ge 0}$ over the Hilbert space $L^2(0,1)$. We observe that (2.32) is of the
form (1.1) with $\sigma = \sqrt{d}\,I_{d-1}$ and $b = -d\,\nabla U$ where

$U(x) = \frac{d}{2}\sum_{i=1}^{d}|x^i - x^{i-1}|^2 + \frac1d\sum_{i=0}^{d-1}V(x^i)$

for $x = (x^1,\dots,x^{d-1})\in\mathbb{R}^{d-1}$ and $x^0 = x^d = 0$. By the discrete Poincaré inequality,

$\sum_{i=1}^{d}|x^i - x^{i-1}|^2\ge 2(1-\cos(\pi/d))\sum_{i=1}^{d-1}|x^i|^2.$

Hence for any $x,\xi\in\mathbb{R}^{d-1}$ and $x^0 = x^d = \xi^0 = \xi^d = 0$, the lower bound

$\partial^2_{\xi\xi}U(x) = d\sum_{i=1}^{d}|\xi^i - \xi^{i-1}|^2 + \frac1d\sum_{i=1}^{d-1}V''(x^i)\,|\xi^i|^2\ge\frac{K_d}{d}\sum_{i=1}^{d-1}|\xi^i|^2$

holds with $K_d = 2d^2(1-\cos(\pi/d)) - L$, and thus

$(x-y)\cdot(b(x)-b(y)) = -d\,(x-y)\cdot(\nabla U(x)-\nabla U(y))\le -K_d\,|x-y|^2$

for any $x,y\in\mathbb{R}^{d-1}$ where $|\cdot|$ denotes the Euclidean norm. Choosing for $\|\cdot\|$ the
intrinsic metric $\|x\| = d^{-1/2}|x|$, we obtain

$\kappa(r)\ge 2K_d$  for any $r > 0$.

In particular, the function $\kappa$ is bounded from below uniformly by a real constant
that does not depend on the dimension $d$ since

(2.34)  $\lim_{d\to\infty}K_d = \pi^2 - L > -\infty$.
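The constant $K_d$ and the discrete Poincaré inequality used above can be checked numerically. The sketch below is an illustration under stated assumptions: the function names `K_d` and `dirichlet_energy` are hypothetical, and $2(1-\cos\theta)$ is rewritten as $4\sin^2(\theta/2)$ only to avoid floating-point cancellation for large $d$. The vector $x^i = \sin(i\pi/d)$ is the eigenvector of the discrete Dirichlet Laplacian for which the Poincaré inequality holds with equality.

```python
import math

def K_d(d, L):
    """K_d = 2 d^2 (1 - cos(pi/d)) - L, evaluated as 4 d^2 sin^2(pi/(2d)) - L."""
    return 4.0 * d * d * math.sin(math.pi / (2.0 * d)) ** 2 - L

def dirichlet_energy(interior):
    """sum_{i=1}^{d} |x^i - x^{i-1}|^2 with boundary values x^0 = x^d = 0.

    interior : the values x^1, ..., x^{d-1}.
    """
    full = [0.0] + list(interior) + [0.0]
    return sum((full[i] - full[i - 1]) ** 2 for i in range(1, len(full)))
```

For $L = 0$ the values $K_2 = 8$, $K_3 = 9$ increase towards $\pi^2\approx 9.87$, in line with (2.34).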

Theorem 2.14 now shows that for any $R > 0$, local exponential contractivity in
the sense of (2.30) holds on the ball

$B_{R/2} = \{x\in\mathbb{R}^{d-1} : \|x\|\le R/2\} = \{x\in\mathbb{R}^{d-1} : |x|\le d^{1/2}R/2\}$

with rate $c_R$ satisfying

$\frac{1}{c_R}\le 4\sqrt{\pi}\,R^{-1}|K_d|^{-3/2}\exp(-K_dR^2/4)$  for $K_dR^2\le -4$,

$\frac{1}{c_R}\le (e-1)R^2/2$  for $-4\le K_dR^2 < 0$,

$\frac{1}{c_R}\le R^2/2$  for $K_d = 0$, respectively.

Here the explicit upper bounds are obtained analogously as in the proof of Lemma
2.9. For $K_d > 0$, strict convexity holds, and we obtain global exponential
contractivity with a dimension-independent rate. We remark that because of (2.34),
the bounds also carry over to the limiting SPDE (2.33) for which they imply local
exponential contractivity on balls w.r.t. the $L^2$ norm.

3. Main results on product spaces 

3.1. Componentwise reflection coupling and contractivity on product
spaces. We now consider a system

(3.1)  $dX_t^i = b^i(X_t)\,dt + dB_t^i$,  $i = 1,\dots,n$,

of $n$ interacting diffusion processes taking values in $\mathbb{R}^{d_i}$, $d_i\in\mathbb{N}$. Here $B^i$, $i =
1,\dots,n$, are independent Brownian motions in $\mathbb{R}^{d_i}$, $X = (X^1,\dots,X^n)$ is a diffusion
process taking values in $\mathbb{R}^d$ where $d = \sum_{i=1}^n d_i$, and $b^i : \mathbb{R}^d\to\mathbb{R}^{d_i}$ are locally
Lipschitz continuous functions. We will assume that

(3.2)  $b^i(x) = b_0^i(x^i) + \gamma^i(x)$,  $i = 1,\dots,n$,

where the functions $b_0^i : \mathbb{R}^{d_i}\to\mathbb{R}^{d_i}$ are locally Lipschitz continuous, and $\gamma^i : \mathbb{R}^d\to
\mathbb{R}^{d_i}$ are "sufficiently small" perturbations, cf. Theorem 3.1 below. In particular,
for $\gamma^i\equiv 0$ the components $X^1,\dots,X^n$ are independent.

To analyse contraction properties of the process $X$, one could use a reflection
coupling on $\mathbb{R}^d$ and apply the results above based on a distance function of the
form $d_f(x,y) = f(|x-y|)$. In some applications, this approach does indeed provide
dimension-free bounds, cf. Example 2.15 above. However, in the product case
$\gamma\equiv 0$ it leads in general to lower bounds for contraction rates that degenerate
rapidly as $n\to\infty$, even though one would expect exponential contractivity with
the minimum of the contraction rates for the components. The reason is that the
approach requires convexity outside a Euclidean ball in $\mathbb{R}^d$ whereas in corresponding
product models, in general convexity only holds if all components are outside
given balls in $\mathbb{R}^{d_i}$.

Instead, we now consider contractivity w.r.t. Kantorovich distances $W_{f,w}$ based
on distance functions on $\mathbb{R}^d = \mathbb{R}^{d_1+\cdots+d_n}$ of the form

(3.3)  $d_{f,w}(x,y) = \sum_{i=1}^n f_i(|x^i - y^i|)\,w_i$.

Here $f_i : [0,\infty)\to[0,\infty)$, $1\le i\le n$, are strictly increasing concave $C^1$ functions
with $f_i(0) = 0$ and $f_i'(0) = 1$ that are obtained from $b_0^i$ in the same way as $f$
has been obtained from $b$ above, and $w_i\in(0,1]$ are positive weights. In many
applications, one can choose $w_i = 1$ for any $i$. The corresponding distance will
then be denoted by $d_{f,1}$. Notice that $d_{f,1}$ is bounded from above by the $\ell^1$ distance

$d_{\ell^1}(x,y) = \sum_{i=1}^n |x^i - y^i|$.

Hence $W_{f,1}$ is bounded from above by the Kantorovich distance $W_{\ell^1}$ based on $d_{\ell^1}$.
For $r\in(0,\infty)$ let

(3.4)  $\kappa_i(r) = r^{-2}\inf\big\{-2\,(x-y)\cdot(b_0^i(x)-b_0^i(y)) : x,y\in\mathbb{R}^{d_i} \text{ s.t. } |x-y| = r\big\}$.
Similarly as above, we assume that for $1\le i\le n$,

(3.5)  $\kappa_i : (0,\infty)\to\mathbb{R}$ is continuous with $\liminf_{r\to\infty}\kappa_i(r) > 0$.

Moreover, we assume

(3.6)  $\lim_{r\to 0} r\,\kappa_i(r) = 0$.

Let $R_0^i$, $R_1^i$, $g_i(r)$, $\varphi_i(r)$, $f_i(r)$ and $\Phi_i(r) = \int_0^r\varphi_i(s)\,ds$ be defined analogously to
(2.4), (2.5) and (2.6) with $\kappa$ replaced by $\kappa_i$. Moreover, we define $c_i\in(0,\infty)$ by

(3.7)  $\frac{1}{c_i} = \int_0^{R_1^i}\Phi_i(s)\varphi_i(s)^{-1}\,ds = \int_0^{R_1^i}\int_0^s\exp\Big(\frac14\int_t^s u\,\kappa_i(u)^-\,du\Big)\,dt\,ds.$

Recall that by Theorem 2.2 and Corollary 2.3, $c_i$ is a lower bound for the contraction
rate of the diffusion process $X^i$ on $\mathbb{R}^{d_i}$ satisfying the s.d.e. $dX_t^i = b_0^i(X_t^i)\,dt + dB_t^i$.

Let $p_t(x,dy)$ denote the transition kernels of the diffusion process $X_t = (X_t^1,\dots,X_t^n)$
on $\mathbb{R}^d$ satisfying (3.1). We now state our second main result:

Theorem 3.1 (Exponential contractivity on product spaces). Suppose that
(3.5) and (3.6) hold, and suppose that there exist constants $\varepsilon_i\in[0,c_i)$, $1\le i\le n$,
such that for any $x,y\in\mathbb{R}^d$,

(3.8)  $\sum_{i=1}^n |\gamma^i(x)-\gamma^i(y)|\,w_i\le\sum_{i=1}^n\varepsilon_i\,f_i(|x^i-y^i|)\,w_i$.

Then for any $t\ge 0$ and any probability measures $\mu,\nu$ on $\mathbb{R}^d$,

(3.9)   $W_{f,w}(\mu p_t,\nu p_t)\le\exp(-ct)\,W_{f,w}(\mu,\nu)$, and

(3.10)  $W_{\ell^1}(\mu p_t,\nu p_t)\le A\exp(-ct)\,W_{\ell^1}(\mu,\nu)$,

where $c = \min_{i=1,\dots,n}(c_i - \varepsilon_i)$ and $A = 2\big/\min_{i=1,\dots,n}\big(\varphi_i(R_0^i)\,w_i\big)$.

Example 3.2 (Product model). In the product case, $\gamma^i\equiv 0$ for any $i$. Hence
Condition (3.8) is satisfied with $\varepsilon_i = 0$, and, therefore,

$W_{f,w}(\mu p_t,\nu p_t)\le\exp(-ct)\,W_{f,w}(\mu,\nu)$

holds with $c = \min c_i$ for any choice of the weights $w_1,\dots,w_n$.

More generally than in the example, suppose now that $\gamma = (\gamma^1, \dots, \gamma^n)$ satisfies
an $\ell^1$-Lipschitz condition

(3.11) $\sum_{i=1}^{n} |\gamma^i(x) - \gamma^i(y)| \le \lambda \sum_{i=1}^{n} |x^i - y^i| \quad \forall\, x, y \in \mathbb{R}^d.$

Then exponential contractivity holds for the perturbed product model provided
$\lambda < c_i\, \varphi_i(R_0^i)/2$ for any $i$:

Corollary 3.3 (Perturbations of product models). Suppose that (3.2), (3.5),
(3.6) and (3.11) hold with $\lambda \in [0, \infty)$. Then for any $t \ge 0$ and any probability
measures $\mu$, $\nu$ on $\mathbb{R}^d$,

(3.12) $W_{f,1}(\mu p_t, \nu p_t) \le \exp(-ct)\, W_{f,1}(\mu,\nu)$, and

(3.13) $W_{\ell^1}(\mu p_t, \nu p_t) \le A \exp(-ct)\, W_{\ell^1}(\mu,\nu)$,

where $c = \min_{i=1,\dots,n} (c_i - 2\lambda\, \varphi_i(R_0^i)^{-1})$ and $A = 2 \max_{i=1,\dots,n} \varphi_i(R_0^i)^{-1}$.

The intuitive idea of the proof of Theorem 3.1 is to construct a coupling $(X_t, Y_t)$
of two solutions of (3.1) by applying a reflection coupling individually for each
component $(X_t^i, Y_t^i)$ if $X_t^i \ne Y_t^i$, and a basic coupling if $X_t^i = Y_t^i$. In the product
case this just means that $X_t^i = Y_t^i$ for any $t \ge \tau^i$ where $\tau^i = \inf\{t \ge 0 : X_t^i = Y_t^i\}$
is the coupling time for the $i$-th component. In the non-product case, however,
$X_t^i$ and $Y_t^i$ can move apart again after the time $\tau^i$ due to interactions with other
components. In that case it is not clear how to define a coupling as described above
rigorously. Instead we will use a regularized version where reflection coupling is
applied to the $i$-th component whenever $|X_t^i - Y_t^i| \ge \delta$ for a given constant $\delta > 0$,
and basic coupling is applied whenever $|X_t^i - Y_t^i| \le \delta/2$. A precise description of
the coupling and the proofs of Theorem 3.1 and Corollary 3.3 are given in Sections
6 and 7 below.



3.2. Consequences. The contractivity results in Theorem 3.1 and Corollary 3.3 
have corresponding consequences as the contractivity results in the non-product 
case, cf. Section 2.2 above. An important difference to be noted is, however, that 
on product spaces, 

$d_{f,w}(x,y) \le \sum_{i=1}^{n} |x^i - y^i| \le n^{1/2}\, |x - y|$

by the Cauchy-Schwarz inequality. Therefore, an additional factor n occurs in the 
variance bounds from Corollaries 2.5, 2.6 and 2.7 on product spaces. Apart from 
this additional factor, all results in Section 2.2 carry over to the setup considered 
in Section 3.1. 



3.3. Interacting Langevin diffusions. As an illustration of the results in Section 3.1,
we consider a system

(3.14) $dX_t^i = -\frac12 \nabla U(X_t^i)\,dt - \sum_{j=1}^{n} a_{ij}\, \nabla V(X_t^i - X_t^j)\,dt + dB_t^i$

of $n$ interacting overdamped Langevin diffusions taking values in $\mathbb{R}^k$ for some
$k \in \mathbb{N}$. Here $B^1, \dots, B^n$ are independent Brownian motions in $\mathbb{R}^k$, $U \in C^2(\mathbb{R}^k)$
is strictly convex outside a given ball, the interaction potential $V$ is in $C^2(\mathbb{R}^k)$
with bounded second derivatives, and $a_{ij}$, $1 \le i,j \le n$, are finite real constants.
For example, we are interested in nearest-neighbour interactions and mean-field
interactions given by

(3.15) $a_{ij} = a/2$ if $i - j = 1 \bmod n$ or $i - j = -1 \bmod n$, and $a_{ij} = 0$ otherwise,

(3.16) $a_{ij} = a\, n^{-1}$ respectively,
where $a \in \mathbb{R}$ is a finite coupling constant.

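Choosing $b_0^i(x^i) = -\nabla U(x^i)/2$ and $\gamma^i(x) = -\sum_{j=1}^{n} a_{ij} \nabla V(x^i - x^j)$, we observe
that the function

$\kappa_i(r) = \inf\Big\{ \int_0^1 \partial^2_{(x-y)/|x-y|} U((1-t)x + ty)\,dt : x, y \in \mathbb{R}^k \text{ s.t. } |x - y| = r \Big\}$

does not depend on $i$. Let $\varphi$ and $f$ be the corresponding functions given by (2.6),
and consider the distance

$d_{f,1}(x,y) = \sum_{i=1}^{n} f(|x^i - y^i|).$

Moreover, let $c$ be given by (2.8) with $\alpha = 1$, i.e., $c$ is the lower bound for the
contraction rate of the diffusion process $Y$ in $\mathbb{R}^k$ satisfying $dY = -\frac12 \nabla U(Y)\,dt + dB$.
We note that $\gamma$ satisfies the $\ell^1$-Lipschitz condition (3.11) with

$\lambda = M \cdot \max_{i=1,\dots,n} \sum_{j=1}^{n} (|a_{ij}| + |a_{ji}|)$

where $M = \sup \|\nabla^2 V\|$. Therefore, if

$M \cdot \max_{i=1,\dots,n} \sum_{j=1}^{n} (|a_{ij}| + |a_{ji}|) < c\, \varphi(R_0)/2,$

then by Corollary 3.3, contractivity in the sense of (3.12) holds with contraction
rate

$\bar c = c - 2\lambda\, \varphi(R_0)^{-1} > 0.$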

In particular, in the nearest neighbour and mean field case, we obtain contractivity
with a rate that does not depend on the dimension if $a$ is small:

Corollary 3.4 (Mean field and nearest neighbour interactions). Let $p_t$,
$t \ge 0$, denote the transition kernels of the diffusion process on $\mathbb{R}^{nk}$ solving (3.14).
Suppose that $\sup \|\nabla^2 V\| < \infty$ and that $a_{ij}$ is given by (3.15) or by (3.16) with
$a \in \mathbb{R}$. Then there exist finite constants $c, \theta, A \in (0, \infty)$ that do not depend on the
dimension $n$ such that

(3.17) $W_{f,1}(\mu p_t, \nu p_t) \le e^{(\theta a - c)t}\, W_{f,1}(\mu,\nu)$, and

(3.18) $W_{\ell^1}(\mu p_t, \nu p_t) \le A\, e^{(\theta a - c)t}\, W_{\ell^1}(\mu,\nu)$

hold for any $t \ge 0$ and any probability measures $\mu$, $\nu$ on $\mathbb{R}^{nk}$. In particular,
exponential contractivity holds for $a < c/\theta$.

The bounds in (3.17) and (3.18) are not sharp. However, it is known that
for example in mean field models where $U$ is a double-well potential and $V$ is
quadratic, exponential contractivity with a rate independent of the dimension
cannot be expected to hold for large $a$. Indeed, in this case the corresponding
McKean-Vlasov process has several stationary distributions if $a > a_1$ for some
critical parameter $a_1 \in (0, \infty)$, cf. [21, 22].
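The dimension independence of the rate in Corollary 3.4 rests on the fact that the $\ell^1$-Lipschitz constant $\lambda = M \max_i \sum_j (|a_{ij}| + |a_{ji}|)$ does not grow with $n$ for the coefficients (3.15) and (3.16). The following sketch checks this directly; the helper names `lipschitz_constant`, `nearest_neighbour` and `mean_field` are hypothetical, and the nearest-neighbour case is only meaningful for $n \ge 3$, where the two neighbours of a site are distinct.

```python
def lipschitz_constant(a, M):
    """lambda = M * max_i sum_j (|a_ij| + |a_ji|) for an n x n coefficient matrix a."""
    n = len(a)
    return M * max(
        sum(abs(a[i][j]) + abs(a[j][i]) for j in range(n)) for i in range(n)
    )

def nearest_neighbour(n, a):
    # (3.15): a_ij = a/2 if i - j = +1 or -1 mod n, and 0 otherwise
    return [[a / 2 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)]
            for i in range(n)]

def mean_field(n, a):
    # (3.16): a_ij = a / n for all i, j
    return [[a / n for _ in range(n)] for _ in range(n)]
```

In both cases the value is $2|a|M$ for every $n \ge 3$, so the smallness condition on $\lambda$ in Corollary 3.3 turns into a condition on $a$ alone.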

4. Proofs for Reflection Coupling 

In this section, we first motivate our particular choice of the function $f$, and we
prove Theorem 2.2. Afterwards, we prove Corollaries 2.3, 2.5, 2.6 and 2.7.

Let $r_t = \|X_t - Y_t\|$ where $(X, Y)$ is a reflection coupling of two solutions of (1.1).
Our goal is to find an explicit concave increasing function $f : [0, \infty) \to [0, \infty)$ with
$f(0) = 0$ and $f'(0) = 1$ such that $e^{ct} f(r_t)$ is a (local) supermartingale for $t$ less
than the coupling time $T$ with a constant $c > 0$ that we are trying to maximize
by the choice of $f$.

An application of Itô's formula to the s.d.e. (1.3) satisfied by the difference
process $Z_t = X_t - Y_t$ shows that the following Itô equations hold almost surely for
$t < T$ whenever $f$ is $C^1$ and $f'$ is absolutely continuous:

$d\|Z_t\|^2 = 4\, |\sigma^{-1} Z_t|^{-1} \|Z_t\|^2\, dW_t + 2\, Z_t \cdot G(b(X_t) - b(Y_t))\, dt + 4\, |\sigma^{-1} Z_t|^{-2} \|Z_t\|^2\, dt,$

$dr_t = 2\, |\sigma^{-1} Z_t|^{-1} r_t\, dW_t + r_t^{-1} Z_t \cdot G(b(X_t) - b(Y_t))\, dt$, and

(4.1) $df(r_t) = 2\, |\sigma^{-1} Z_t|^{-1} r_t f'(r_t)\, dW_t + r_t^{-1} Z_t \cdot G(b(X_t) - b(Y_t))\, f'(r_t)\, dt + 2\, |\sigma^{-1} Z_t|^{-2} r_t^2 f''(r_t)\, dt.$

By definition of the function $\kappa$, the drift term on the right hand side of (4.1) is
bounded from above by

(4.2) $\beta_t := 2\, |\sigma^{-1} Z_t|^{-2} r_t^2 \cdot \big( f''(r_t) - \frac14 r_t\, \kappa(r_t) f'(r_t) \big).$

Hence the process $e^{ct} f(r_t)$ is a supermartingale for $t < T$ if $\beta_t \le -c f(r_t)$. Since

(4.3) $|\sigma^{-1} z|^2 \le \alpha \|z\|^2$ for any $z \in \mathbb{R}^d$,

a sufficient condition is

(4.4) $f''(r) - \frac14 r\, \kappa(r) f'(r) \le -\frac{\alpha c}{2} f(r)$ for a.e. $r > 0$.

We now first observe that this inequality holds with $c = 0$ (i.e., $f(r_t)$ is a
supermartingale for $t < T$) if $f$ is chosen such that $f'(r) = \varphi(r) = \exp(-\int_0^r s\, \kappa(s)^-\,ds / 4)$.
Indeed, $f(r) = \int_0^r \varphi(s)\,ds$ is the least concave among all concave functions $f$
satisfying $\beta_t \le 0$.

To satisfy the stronger condition $\beta_t \le -c f(r_t)$ with $c > 0$, we make the ansatz

(4.5) $f'(r) = \varphi(r)\, g(r)$

with a decreasing absolutely continuous function $g \ge 1/2$ such that $g(0) = 1$.
Note that the condition $g \ge 0$ is required to ensure that $f$ is non-decreasing. By
replacing this condition by the stronger condition $g \ge 1/2$, we are losing at most
a factor 2 in the estimates below. On the other hand, the condition $1/2 \le g \le 1$
has the huge advantage of ensuring that

(4.6) $\Phi/2 \le f \le \Phi$

where $\Phi(r) = \int_0^r \varphi(s)\,ds$. The ansatz (4.5) yields

$f'' = -\frac14 r\, \kappa^- f' + \varphi g' \le \frac14 r\, \kappa f' + \varphi g',$

i.e., Condition (4.4) is satisfied if

(4.7) $g' \le -\frac{\alpha c}{2} f/\varphi$ almost everywhere.

We will see in the proof below that for $r \ge R_1$, Condition (4.4) is automatically
satisfied since $\kappa$ is sufficiently positive. Therefore, it is enough to assume that (4.7)
holds on $(0, R_1)$.

Now on the one hand, if (4.7) is satisfied on $(0, R_1)$ then

$g(R_1) \le 1 - \frac{\alpha c}{2} \int_0^{R_1} f(s)\varphi(s)^{-1}\,ds \le 1 - \frac{\alpha c}{4} \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds.$

This condition can only be satisfied with a function $g$ taking values in $[1/2, 1]$ if

$\alpha c \le 2 \Big/ \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds.$

On the other hand, by choosing

(4.8) $g'(r) = -\frac{\Phi(r)}{2\varphi(r)} \Big/ \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds$ for $r < R_1$,

Condition (4.7) is satisfied with the constant

$\alpha c = 1 \Big/ \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds.$

This shows that up to a factor 2, choosing $g$ as in (4.8) is the best we can do under
the assumptions that we have made.

The considerations above explain the particular choice of the function $f$ made
in (2.6). Once this choice has been made, the proof of Theorem 2.2 is almost
straightforward:



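Proof of Theorem 2.2. As remarked above, the drift in the s.d.e. (4.1) for
$f(r_t)$ is bounded from above by $\beta_t$ defined by (4.2). We now show that by our
choice of $f$ in (2.6), this expression is smaller than $-c f(r_t)$ where $c$ is given by
(2.8). Indeed, for $r < R_1$,

(4.9) $f''(r) = -\frac14 r\, \kappa(r)^- \varphi(r) g(r) - \frac12 \Phi(r) \Big/ \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds \le \frac14 r\, \kappa(r) f'(r) - \frac12 f(r) \Big/ \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds.$

For $r > R_1$, we have $f'(r) = \varphi(r)/2 = \varphi(R_0)/2$ and $\kappa(r) R_1 (R_1 - R_0) \ge 8$ by
definition of $R_1$, whence

(4.10) $f''(r) - \frac14 r\, \kappa(r) f'(r) = -\frac18 r\, \kappa(r) \varphi(R_0) \le -\frac{\varphi(R_0)}{R_1 - R_0} \cdot \frac{r}{R_1} \le -\frac12 f(r) \Big/ \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds.$

Here we have used that for $r \ge R_0$, the function $\varphi(r)$ is constant, and, therefore,
$\Phi(r) = \Phi(R_0) + (r - R_0)\varphi(R_0)$, and

$\int_{R_0}^{R_1} \Phi(s)\varphi(s)^{-1}\,ds = \int_{R_0}^{R_1} (\Phi(R_0) + (s - R_0)\varphi(R_0))\, \varphi(R_0)^{-1}\,ds$

$= \Phi(R_0)\varphi(R_0)^{-1}(R_1 - R_0) + (R_1 - R_0)^2/2$

$\ge (R_1 - R_0)\, (\Phi(R_0) + (R_1 - R_0)\varphi(R_0))\, \varphi(R_0)^{-1}/2$

$= (R_1 - R_0)\, \Phi(R_1)\, \varphi(R_0)^{-1}/2,$

together with $f \le \Phi$ and $\Phi(r)/\Phi(R_1) \le r/R_1$ for $r \ge R_1$, which holds by concavity of $\Phi$.

By (4.9) and (4.10), we conclude that $\beta_t \le -c f(r_t)$. Optional stopping in (4.1) at
$T_k = \inf\{t \ge 0 : r_t \notin (k^{-1}, k)\}$ now implies

$E[f(r_t)\, ;\, t < T_k] \le E[f(r_0)] - c \int_0^t E[f(r_s)\, ;\, s < T_k]\,ds$

for any $k \in \mathbb{N}$ and $t \ge 0$. The assertion follows for $k \to \infty$ since $r_t = 0$ for $t \ge T$,
and $T = \sup T_k$ by non-explosiveness. □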

Proof of Corollary 2.3. Let $(X, Y)$ be a reflection coupling of two solutions
of (1.1) with joint initial distribution $(X_0, Y_0) \sim \eta$. Then by Theorem 2.2,

$W_f(\mu p_t, \nu p_t) \le E[d_f(X_t, Y_t)] \le e^{-ct} E[d_f(X_0, Y_0)] = e^{-ct} \int d_f(x,y)\, \eta(dx\,dy)$

for any $t \ge 0$. The estimate (2.9) now follows by taking the infimum over all
couplings $\eta$ of two given probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$. Moreover, (2.10)
follows from (2.9) by (2.11). □

Next, we are going to prove the results in Section 2.2. Suppose that (2.12) holds,
$\|\cdot\| = |\sigma^{-1} \cdot|$ is the intrinsic metric, and $b$ is in $C^1$. Corollary 2.3 implies

$\int |y|\, p_t(x, dy) \le \int |y|\, p_t(x_0, dy) + W^1(p_t(x, \cdot), p_t(x_0, \cdot)) < \infty$

for any $t \ge 0$ and any $x \in \mathbb{R}^d$. In particular, $(p_t g)(x) = \int g(y)\, p_t(x, dy)$ is defined
for any Lipschitz continuous function $g : \mathbb{R}^d \to \mathbb{R}$, and

$|(p_t g)(x) - (p_t g)(y)| = |E[g(X_t) - g(Y_t)]| \le \|g\|_{\mathrm{Lip}(f)}\, E[d_f(X_t, Y_t)]$

for any coupling $(X_t, Y_t)$ of $p_t(x, \cdot)$ and $p_t(y, \cdot)$. Hence by Theorem 2.2,

(4.11) $|(p_t g)(x) - (p_t g)(y)| \le e^{-ct}\, \|g\|_{\mathrm{Lip}(f)}\, d_f(x,y),$

i.e., $p_t$ satisfies the exponential contractivity condition (2.13) w.r.t. $\|\cdot\|_{\mathrm{Lip}(f)}$. If
$p_t g$ is $C^1$ then by (4.11) and since

$d_f(x,y) \le \|x - y\| = |\sigma^{-1}(x - y)| \quad \forall\, x, y \in \mathbb{R}^d,$

we obtain the uniform gradient bound

(4.12) $\sup |\sigma^T \nabla p_t g| \le e^{-ct}\, \|g\|_{\mathrm{Lip}(f)} \quad \forall\, t \ge 0.$

It is well-known that this bound can be used to control variances w.r.t. the
measures $p_t(x, \cdot)$:

Lemma 4.1. For any $t \ge 0$, $x \in \mathbb{R}^d$, and any Lipschitz continuous $g : \mathbb{R}^d \to \mathbb{R}$,

(4.13) $\mathrm{Var}_{p_t(x,\cdot)}(g) \le \frac{1 - e^{-2ct}}{2c}\, \|g\|^2_{\mathrm{Lip}(f)}.$

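Proof. We may assume $g \in C^2(\mathbb{R}^d)$ and $t > 0$. Then, by standard elliptic
regularity results, $(t, x) \mapsto (p_t g)(x)$ is differentiable in $t$ and $x$, and

$\frac{d}{dt} p_t g = \mathcal{L} p_t g = p_t \mathcal{L} g$

where $\mathcal{L} = \frac12 \sum_{i,j=1}^{d} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + b(x) \cdot \nabla$, $a = \sigma\sigma^T$, is the generator of $(X_t)$, cf. e.g.
[33, 31]. In particular, for $s \in (0, t)$,

$\frac{d}{ds}\, p_s (p_{t-s} g)^2 = p_s \big( \mathcal{L} (p_{t-s} g)^2 - 2\, p_{t-s} g\, \mathcal{L} p_{t-s} g \big) = p_s |\sigma^T \nabla p_{t-s} g|^2 \le e^{-2c(t-s)}\, \|g\|^2_{\mathrm{Lip}(f)}$

by (4.12). Integrating w.r.t. $s$, we obtain

$p_t g^2 - (p_t g)^2 \le \frac{1 - \exp(-2ct)}{2c}\, \|g\|^2_{\mathrm{Lip}(f)},$

which is equivalent to (4.13). □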

By Lemma 4.1 and (4.11), we can now easily prove Corollaries 2.5, 2.6 and 2.7: 



Proof of Corollary 2.5. Existence and uniqueness of a stationary distribution
$\mu$ for $(p_t)_{t \ge 0}$ satisfying $\int |y|\, \mu(dy) < \infty$ follows easily as in [27], Section 3: By
Corollary 2.3, the map $\nu \mapsto \nu p_1$ is a contraction w.r.t. the distance $W_f$ (equivalent
to $W^1$) on the complete metric space $\mathcal{P}^1$ of all probability measures $\nu$ on
$(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ satisfying $\int |y|\, \nu(dy) < \infty$. Hence by the Banach fixed point theorem,
there exists a unique probability measure $\mu_0$ such that $\mu_0 p_1 = \mu_0$. It is then
elementary to verify that the measure $\mu = \int_0^1 \mu_0 p_s\,ds$ satisfies $\mu p_t = \mu$ for any
$t \in [0, 1]$, and hence for any $t \in [0, \infty)$. Moreover, by Corollary 2.3,

$W_f(\mu, \nu p_t) = W_f(\mu p_t, \nu p_t) \le e^{-ct}\, W_f(\mu, \nu)$

for any $\nu \in \mathcal{P}^1$. In particular, as $t \to \infty$, $p_t(x, \cdot) \to \mu$ in $\mathcal{P}^1$ for any $x \in \mathbb{R}^d$.
The variance bound for $\mu$ now follows from the corresponding bound for $p_t(x, \cdot)$ in
Lemma 4.1. □

Proof of Corollary 2.6. By Lemma 4.1,

$\mathrm{Cov}(g(X_t), h(X_{t+s})) = E[g(X_t) h(X_{t+s})] - E[g(X_t)]\, E[h(X_{t+s})]$
$= E[(g\, p_s h)(X_t)] - E[g(X_t)]\, E[(p_s h)(X_t)] = \mathrm{Cov}_{p_t(x_0, \cdot)}(g, p_s h)$
$\le (1 - \exp(-2ct))\, (2c)^{-1}\, \|g\|_{\mathrm{Lip}(f)}\, \|p_s h\|_{\mathrm{Lip}(f)}$

for any $s, t \ge 0$. The assertion now follows by (4.11). □



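Proof of Corollary 2.7. The bound for the bias follows immediately from
(4.11), since

$\Big| E\Big[ \frac{1}{t} \int_0^t g(X_s)\,ds - \int g\,d\mu \Big] \Big| = \Big| \frac{1}{t} \int_0^t \int (p_s g(x_0) - p_s g(y))\, \mu(dy)\,ds \Big| \le \frac{1}{t} \int_0^t e^{-cs}\,ds\ \|g\|_{\mathrm{Lip}(f)} \int d_f(x_0, y)\, \mu(dy).$

Moreover, by Corollary 2.6,

$\mathrm{Var}\Big( \frac{1}{t} \int_0^t g(X_s)\,ds \Big) = \mathrm{Cov}\Big( \frac{1}{t} \int_0^t g(X_s)\,ds,\ \frac{1}{t} \int_0^t g(X_s)\,ds \Big)$

$= \frac{2}{t^2} \int_0^t \int_s^t \mathrm{Cov}(g(X_s), g(X_u))\,du\,ds$

$\le \frac{1}{c t^2} \int_0^t (1 - e^{-2cs}) \int_s^t e^{-c(u-s)}\,du\,ds\ \|g\|^2_{\mathrm{Lip}(f)}$

$\le \frac{1}{c^2 t}\, \|g\|^2_{\mathrm{Lip}(f)}.$ □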



5. Examples 

We now prove the results in Sections 2.3 and 2.4, including in particular Lemma 
2.9, Lemma 2.12 and Theorem 2.14. 



Proof of Lemma 2.9 and Remark 2.10. We first prove the lower bounds on
the exponential decay rate $c$ in (2.8) stated in (2.19), (2.17) and (2.18). Notice that
the constant $c$ defined by (2.8) increases if $\kappa(r)$ is replaced by a greater function.
Indeed, for $r > 0$,

(5.1) $\Phi(r)\varphi(r)^{-1} = \int_0^r \varphi(t)\varphi(r)^{-1}\,dt = \int_0^r \exp\Big( \frac14 \int_t^r s\, \kappa(s)^-\,ds \Big)\,dt,$

whence $R_0$, $R_1$ and $c^{-1} = \alpha \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds$ are decreasing functions of $\kappa$.

Convex case. Suppose first that $\kappa(r) \ge 0$ for any $r > 0$ and $\kappa(r) \ge K$ for $r \ge R$
with constants $K \in (0, \infty)$ and $R \in [0, \infty)$. Then $R_0 = 0$, $R_1 \le \max(R, \sqrt{8/K})$,
$\varphi \equiv 1$, and hence

$c = (\alpha R_1^2/2)^{-1} \ge \alpha^{-1} \min(R^{-2}/2, K/4).$

Locally non-convex case. Now suppose that $\kappa(r) \ge -L$ for $r \le R$ and $\kappa(r) \ge K$
for $r > R$ with constants $K, L \in (0, \infty)$ and $R \in [0, \infty)$. Since $\varphi(r) = \varphi(R_0)$ and
$\Phi(r) = \Phi(R_0) + (r - R_0)\varphi(R_0)$ for $r \ge R_0$, we have

$\alpha^{-1} c^{-1} = \int_0^{R_1} \Phi(s)\varphi(s)^{-1}\,ds$

(5.2) $= \int_0^{R_0} \Phi(s)\varphi(s)^{-1}\,ds + (R_1 - R_0)\, \Phi(R_0)\varphi(R_0)^{-1} + (R_1 - R_0)^2/2.$

The lower curvature bounds imply the upper bounds

(5.3) $R_0 \le R, \qquad R_1 - R_0 \le \min(8/(K R_0), \sqrt{8/K})$, and

(5.4) $\Phi(r)\varphi(r)^{-1} \le \int_0^r \exp(L(r^2 - t^2)/8)\,dt \le \min(\sqrt{2\pi/L},\, r)\, \exp(L r^2/8)$ for $r \le R_0$.

Since $\exp x \le 1 + (e - 1)x$ for $x \in [0, 1]$ and

$\int_0^x \exp(u^2)\,du \le e + \int_1^x (2 - u^{-2}) \exp(u^2)\,du = x^{-1} \exp(x^2)$ for $x \ge 1$,

we can conclude that

$\int_0^{R_0} \Phi(r)\varphi(r)^{-1}\,dr \le \int_0^{R_0} r \exp(L r^2/8)\,dr = 4 L^{-1} (\exp(L R_0^2/8) - 1) \le (e - 1) R_0^2/2$ if $L R_0^2/8 \le 1$, and

$\int_0^{R_0} \Phi(r)\varphi(r)^{-1}\,dr \le \sqrt{2\pi/L} \int_0^{R_0} \exp(L r^2/8)\,dr = \sqrt{2\pi/L}\, \sqrt{8/L} \int_0^{\sqrt{L R_0^2/8}} \exp(u^2)\,du \le 8\sqrt{2\pi}\, L^{-3/2} R_0^{-1} \exp(L R_0^2/8)$ if $L R_0^2/8 \ge 1$.

Combining these estimates, we obtain by (5.2), (5.3) and (5.4),

$\alpha^{-1} c^{-1} \le (e - 1) R^2/2 + e \sqrt{8/K}\, R + 4/K$ if $L R_0^2/8 \le 1$, and

$\alpha^{-1} c^{-1} \le 8\sqrt{2\pi}\, R^{-1} L^{-1/2} (L^{-1} + K^{-1}) \exp(L R^2/8) + 32\, R^{-2} K^{-2}$ if $L R_0^2/8 \ge 1$,

where we have used that the function $x^{-1}\exp(x^2)$ is increasing for $x \ge 1$. □
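The two elementary inequalities used in this proof, $\exp x \le 1 + (e-1)x$ on $[0,1]$ and $\int_0^x \exp(u^2)\,du \le x^{-1}\exp(x^2)$ for $x \ge 1$, are easy to sanity-check numerically. The sketch below is not part of the argument; the helper names and the trapezoidal quadrature are illustrative assumptions, and a passing check at finitely many points is of course no substitute for the proof.

```python
import math

def integral_exp_sq(x, n=20000):
    """Trapezoidal approximation of int_0^x exp(u^2) du."""
    h = x / n
    s = 0.5 * (1.0 + math.exp(x * x))
    for i in range(1, n):
        u = i * h
        s += math.exp(u * u)
    return h * s

def check_elementary_bounds(xs_unit, xs_large):
    """Check exp(x) <= 1 + (e - 1) x on xs_unit and
    int_0^x exp(u^2) du <= exp(x^2) / x on xs_large (points >= 1)."""
    ok1 = all(math.exp(x) <= 1.0 + (math.e - 1.0) * x + 1e-12 for x in xs_unit)
    ok2 = all(integral_exp_sq(x) <= math.exp(x * x) / x for x in xs_large)
    return ok1, ok2
```

The small tolerance in the first check only absorbs floating-point rounding at $x = 1$, where the inequality holds with equality.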

Proofs for Example 2.11. Consider the one-dimensional Langevin diffusion
$(X_t)$ with drift $-\nabla U(x)/2$ and generator

(5.5) $\mathcal{L} v = \frac12 (v'' - U' v') = \frac12 e^{U} (e^{-U} v')'.$

The assumption $\liminf_{|x| \to \infty} U''(x) > 0$ implies that there is a unique strictly
positive bounded eigenfunction $v_1 \in C^2(0, \infty) \cap C([0, \infty))$ satisfying $v_1(0) = 0$,
$v_1'(0) = 1$ and $\mathcal{L} v_1 = -\lambda_1 v_1$, where

$\lambda_1 = \lambda_1(0, \infty) = \inf_{v \in C_0^\infty(0, \infty)} \frac{\frac12 \int_0^\infty v'(x)^2 \exp(-U(x))\,dx}{\int_0^\infty v(x)^2 \exp(-U(x))\,dx}$

is the infimum of the spectrum of the self-adjoint realization of $-\mathcal{L}$ with Dirichlet
boundary conditions on $(0, \infty)$. Since $\mathcal{L} v_1 = -\lambda_1 v_1$ and $v_1$ is bounded, the process
$M_t = \exp(\lambda_1 t)\, v_1(X_t)$ is a martingale. Optional stopping applied to the diffusion
with initial condition $X_0 = x_0$ shows that

$v_1(x_0) = E_{x_0}[M_0] = E_{x_0}[M_{\tau_0 \wedge t}] = E_{x_0}[\exp(\lambda_1 t)\, v_1(X_t)\, ;\, \tau_0 > t]$

(5.6) $\le \exp(\lambda_1 t)\, P_{x_0}[\tau_0 > t]\, \sup v_1$

for any $x_0 > 0$ and $t \ge 0$. Since $v_1(x_0) > 0$ and $\sup v_1 < \infty$, the estimate (5.6)
implies the asymptotic lower bound

(5.7) $\liminf_{t \to \infty} t^{-1} \log P_{x_0}[\tau_0 > t] \ge -\lambda_1(0, \infty).$

Moreover, for any fixed $t \le \lambda_1^{-1}/4$,

$P_R[\tau_0 > t] \ge e^{-1/4}\, v_1(R)/\sup v_1 \ge 3/4$

provided $v_1(R) \ge \frac34 e^{1/4} \sup v_1 = 0.96\ldots \cdot \sup v_1$. By the eigenfunction equation
$\frac12 e^{U} (e^{-U} v_1')' = -\lambda_1 v_1$, one verifies that the latter condition is satisfied whenever $U$
is growing fast enough on $[R, \infty)$.

For bounding $\lambda_1(0, \infty)$ from above let

$v(x) = \min(\sqrt{L}\, x, 1)$, i.e., $v(x) = \sqrt{L}\, x$ for $x \le 1/\sqrt{L}$ and $v(x) = 1$ for $x \ge 1/\sqrt{L}$.

By the assumptions on $U$, the function $v$ is contained in the weighted Sobolev
space $H_0^{1,2}((0, \infty), e^{-U} dx)$ (closure of $C_0^\infty(0, \infty)$ w.r.t. the norm $\|w\|^2 = \int_0^\infty (w^2 + (w')^2)\, e^{-U}\,dx$).
Therefore, if $L R^2/4 \ge 1$ then (2.21) holds, since
If v'(x) 2 exp(-U(x)) dx J 1/v/ ^Lexp(Lx 2 /2) dx 



/ v(x) 2 exp(-f/ (x)) rfx J*/ 2 t ,( x )2 exp (Lx 2 /2) dx 

L j'exp(y 2 /2)dy 3Le^ 2 [LR 2 ( LR 2 

< a / exp ' 



2 jj/LR /4 min( ^ l)2 exp ( y 2/ 2 ) rfy 4 V 8 

Here we have used that by assumption, $U(x) \ge -L x^2/2$ for any $x \in \mathbb{R}$ with
equality for $|x| \le R/2$, and for $x \ge 1$,



m(y,l) 2 e y 2 / 2 dy = I ...+ / ... > \ + -e**' 2 - 1 > -U* 2 / 2 



3 a; 3x 



as (x"V / 2 V = (1 - x" 2 ^* ' 2 < e x ' 2 . □ 



Proof of Lemma 2.12. Since $b = b_0 + \gamma$, we have

$(x - y) \cdot G(b(x) - b(y)) = (x - y) \cdot G(b_0(x) - b_0(y)) + (x - y) \cdot G(\gamma(x) - \gamma(y))$

for any $x, y \in \mathbb{R}^d$. Therefore, by (2.24) and by definition of $\kappa$ and $\kappa_0$,

(5.8) $\kappa(r)^- \le \kappa_0(r)^-$ for any $r \ge R$, and

(5.9) $\kappa(r)^- \le \kappa_0(r)^- + 4 r^{-1} \sup \|\gamma\|$ for any $r \in (0, \infty)$.

In particular, if $\gamma$ is bounded then $\kappa$ satisfies the conditions in (2.3). Since the
constant $R_1(b)$ defined w.r.t. $b$ is smaller than the corresponding constant $R_1$ defined
w.r.t. $b_0$, we obtain

$\frac{1}{c} \le \alpha \int_0^{R_1} \int_0^s \exp\Big( \frac14 \int_t^s u\, \kappa(u)^-\,du \Big)\,dt\,ds$

$\le \alpha \int_0^{R_1} \int_0^s \exp\Big( \frac14 \int_t^s u\, \kappa_0(u)^-\,du \Big) \exp(R \sup \|\gamma\|)\,dt\,ds$

$\le \frac{1}{c_0} \cdot \exp(R \sup \|\gamma\|),$

i.e., (2.25) holds.

Similarly, if $\gamma$ satisfies the one-sided Lipschitz condition (2.26) then

(5.10) $\kappa(r)^- \le \kappa_0(r)^- + 2L$ for any $r \in (0, \infty)$.

Hence again the conditions in (2.3) are satisfied, and we obtain

$\frac{1}{c} \le \frac{1}{c_0} \cdot \exp\Big( \frac{L}{2} \int_0^R r\,dr \Big)$

similarly as above, i.e., (2.27) holds. □

Proof of Theorem 2.14. Fix $R > 0$ and probability measures $\mu$, $\nu$ on $\mathbb{R}^d$. By
definition of

for any $r < R$. Therefore, similarly to the proof of Theorem 2.2, Equation (4.1)
shows that the process $e^{c_R t} f_R(r_t)$ is a local supermartingale for $t < T_R$ where

$T_R = \inf\{t \ge 0 : r_t \ge R\}.$

Here $r_t = \|X_t - Y_t\|$ again denotes the distance process for a reflection coupling
$(X_t, Y_t)$ of two solutions of (1.1) with initial distribution given by a coupling $\eta$ of
$\mu$ and $\nu$. By optional stopping and Fatou's lemma, we thus obtain

$E[f_R(r_t)\, ;\, T_R > t] \le e^{-c_R t}\, E[e^{c_R (t \wedge T_R)} f_R(r_{t \wedge T_R})] \le \exp(-c_R t)\, E[f_R(r_0)]$

for any $t \ge 0$, and hence

$E[f_R(r_t)] \le \exp(-c_R t)\, E[f_R(r_0)] + P[T_R \le t]$

$\le e^{-c_R t} \int f_R(\|x - y\|)\, \eta(dx\,dy) + P_\mu[\tau_{R/2} \le t] + P_\nu[\tau_{R/2} \le t].$

The assertion now follows as in the proof of Corollary 2.3 by minimizing over all
couplings $\eta$ of $\mu$ and $\nu$. □

6. Couplings on product spaces 

Let $d = \sum_{i=1}^{n} d_i$ with $n, d_1, \dots, d_n \in \mathbb{N}$. We now consider "componentwise"
couplings for diffusion processes $X_t = (X_t^1, \dots, X_t^n)$ and $Y_t = (Y_t^1, \dots, Y_t^n)$ on $\mathbb{R}^d$
satisfying the s.d.e.

(6.1) $dX_t^i = b^i(X_t)\,dt + dB_t^i, \quad i = 1, \dots, n,$

with initial conditions $X_0 \sim \mu$ and $Y_0 \sim \nu$. Here $B^i$, $i = 1, \dots, n$, are independent
Brownian motions on $\mathbb{R}^{d_i}$, and $b^i : \mathbb{R}^d \to \mathbb{R}^{d_i}$ are locally Lipschitz continuous
functions such that the unique strong solution of (6.1) is non-explosive for any
given initial condition.

Let $\delta > 0$. Suppose that $\lambda^i, \pi^i : \mathbb{R}^d \to [0, 1]$, $i = 1, \dots, n$, are Lipschitz continuous
functions such that

(6.2) $\lambda^i(z)^2 + \pi^i(z)^2 = 1$ for any $z \in \mathbb{R}^d$, and

(6.3) $\lambda^i(z) = 0$ if $|z^i| \le \delta/2$,

and let $B^i$ and $\tilde B^i$, $1 \le i \le n$, be independent Brownian motions on $\mathbb{R}^{d_i}$. Then a
coupling of two solutions of (6.1) with initial distributions $\mu$ and $\nu$ is given by a
strong solution of the system

(6.4) $dX_t^i = b^i(X_t)\,dt + \lambda^i(Z_t)\,dB_t^i + \pi^i(Z_t)\,d\tilde B_t^i,$

$dY_t^i = b^i(Y_t)\,dt + \lambda^i(Z_t)\,(I - 2 e_t^i (e_t^i)^T)\,dB_t^i + \pi^i(Z_t)\,d\tilde B_t^i,$

$1 \le i \le n$, with initial distribution $(X_0, Y_0) \sim \eta$ where $\eta$ is a coupling of $\mu$ and $\nu$.
Here we use the notation

$Z_t = X_t - Y_t,$

and $e_t^i$ is a measurable process taking values in the unit sphere in $\mathbb{R}^{d_i}$ such that

$e_t^i = Z_t^i/|Z_t^i|$ if $Z_t^i \ne 0$, and $e_t^i = u^i$ if $Z_t^i = 0$,

where $u^i$ is an arbitrary fixed unit vector in $\mathbb{R}^{d_i}$. Notice that by (6.3), the choice
of $u^i$ is not relevant for (6.4), which is a standard Itô s.d.e. in $\mathbb{R}^{2d}$ with locally
Lipschitz continuous coefficients. To see that (6.4) defines a coupling, we observe
that $(X_t)$ and $(Y_t)$ satisfy (6.1) w.r.t. the processes $\hat B_t = (\hat B_t^1, \dots, \hat B_t^n)$ and
$\check B_t = (\check B_t^1, \dots, \check B_t^n)$ defined by

$\hat B_t^i = \int_0^t \lambda^i(Z_s)\,dB_s^i + \int_0^t \pi^i(Z_s)\,d\tilde B_s^i,$

$\check B_t^i = \int_0^t \lambda^i(Z_s)\,(I - 2 e_s^i (e_s^i)^T)\,dB_s^i + \int_0^t \pi^i(Z_s)\,d\tilde B_s^i.$

By Lévy's characterization and by (6.2), both $\hat B$ and $\check B$ are indeed Brownian
motions in $\mathbb{R}^d$, cp. the corresponding argument for reflection coupling.

Remark 6.1. (1) By Condition (6.3) and non-explosiveness of (6.1), the coupling
process $(X_t, Y_t)$ is defined for any $t \ge 0$.

(2) By choosing $\lambda^i \equiv 0$ and $\pi^i \equiv 1$ we recover the basic coupling, i.e., the same
noise is applied to both processes $X$ and $Y$.

(3) A componentwise reflection coupling would be informally given by choosing
$\lambda^i(z) = 1$ if $z^i \ne 0$ and $\lambda^i(z) = 0$ if $z^i = 0$. As this function is not continuous
and $e^i(z) = z^i/|z^i|$ also has a discontinuity at zero, it is not obvious how to
make sense of this coupling rigorously. Instead, we will use below an approximate
componentwise reflection coupling where $\lambda^i(z) = 1$ if $|z^i| \ge \delta$ and $\lambda^i(z) = 0$ if
$|z^i| \le \delta/2$ for a small positive constant $\delta$.
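To make the approximate componentwise reflection coupling concrete, here is a minimal simulation sketch. Several things in it are assumptions made for illustration and are not taken from the text: all components are scalar ($d_i = 1$, so that $I - 2 e e^T = -1$ whenever $z^i \ne 0$), the weights are interpolated by $\lambda^i = \sin\theta$, $\pi^i = \cos\theta$ with $\theta$ piecewise linear in $|z^i|$ (one of many Lipschitz choices satisfying (6.2) and (6.3)), and the s.d.e. (6.4) is discretized by a plain Euler-Maruyama step. The function names are hypothetical.

```python
import math
import random

def coupling_weights(z_i, delta):
    """Return (lam, pi) with lam^2 + pi^2 = 1, lam = 0 for |z_i| <= delta/2
    and lam = 1 for |z_i| >= delta; both are Lipschitz in z_i."""
    s = min(max((abs(z_i) - delta / 2) / (delta / 2), 0.0), 1.0)
    theta = 0.5 * math.pi * s
    return math.sin(theta), math.cos(theta)

def coupled_euler_step(x, y, drift, delta, dt, rng):
    """One Euler-Maruyama step of (6.4) for scalar components (d_i = 1).

    x, y: lists of floats; drift maps a state list to the list of drifts b^i.
    """
    bx, by = drift(x), drift(y)
    x_new, y_new = [], []
    for i in range(len(x)):
        lam, pi = coupling_weights(x[i] - y[i], delta)
        dB = rng.gauss(0.0, math.sqrt(dt))        # noise that gets reflected for Y
        dB_tilde = rng.gauss(0.0, math.sqrt(dt))  # noise shared by X and Y
        # for d_i = 1 the reflection I - 2 e e^T equals -1 whenever z_i != 0;
        # at z_i = 0 the weight lam vanishes, so the sign is irrelevant there
        x_new.append(x[i] + bx[i] * dt + lam * dB + pi * dB_tilde)
        y_new.append(y[i] + by[i] * dt - lam * dB + pi * dB_tilde)
    return x_new, y_new
```

When two components agree, $\lambda^i = 0$ and both processes receive the same increment, which is the basic coupling of part (2); once $|z^i| \ge \delta$ the increments are mirror images, as in a reflection coupling.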

By subtracting the equations for $X$ and $Y$ in (6.4), we see that the difference
process $Z = X - Y$ satisfies the s.d.e.

(6.5) $dZ_t^i = (b^i(X_t) - b^i(Y_t))\,dt + 2 \lambda^i(Z_t)\, e_t^i\,dW_t^i,$

$i = 1, \dots, n$, where the processes

$W_t^i = \int_0^t (e_s^i)^T\,dB_s^i, \quad 1 \le i \le n,$

are independent one-dimensional Brownian motions.

Let $r_t^i = |X_t^i - Y_t^i|$ denote the Euclidean norm of $Z_t^i$. The next lemma is crucial
for quantifying contraction properties of the coupling given by (6.4):

Lemma 6.2. Suppose that $f : [0, \infty) \to [0, \infty)$ is a strictly increasing concave
function in $C^1([0, \infty))$ such that $f'$ is absolutely continuous on $(0, \infty)$. Then for
any $i = 1, \dots, n$, the process $f(r_t^i)$ satisfies the Itô equation

(6.6) $f(r_t^i) = f(r_0^i) + 2 \int_0^t \lambda^i(X_s - Y_s) f'(r_s^i)\,dW_s^i + \int_0^t \big\{ e_s^i \cdot (b^i(X_s) - b^i(Y_s))\, f'(r_s^i) + 2 \lambda^i(X_s - Y_s)^2 f''(r_s^i) \big\}\,ds.$

Remark 6.3. The lemma shows in particular that the process $r_t^i$ satisfies

(6.7) $dr_t^i = e_t^i \cdot (b^i(X_t) - b^i(Y_t))\,dt + 2 \lambda^i(X_t - Y_t)\,dW_t^i.$

Notice that in this equation, the drift term does not depend on the choice of $\lambda$.

Proof of Lemma 6.2. Recall that $e_t^i = Z_t^i/|Z_t^i|$ if $r_t^i = |Z_t^i| \ne 0$. Since the
function $y \mapsto y/|y|$ is smooth on $\mathbb{R}^{d_i} \setminus \{0\}$ and $x \mapsto \sqrt{x}$ is smooth on $(0, \infty)$, we
can apply Itô's formula and (6.5) to show that the Itô equations

$d|Z^i|^2 = 2 Z^i \cdot (b^i(X) - b^i(Y))\,dt + 4 \lambda^i(Z)^2\,dt + 4 \lambda^i(Z)\, |Z^i|\,dW^i,$

$dr^i = \frac{1}{2 r^i}\, d|Z^i|^2 - \frac{1}{8 (r^i)^3}\, d[|Z^i|^2]$

(6.8) $= e^i \cdot (b^i(X) - b^i(Y))\,dt + 2 \lambda^i(X - Y)\,dW^i$

hold almost surely on any stochastic interval $[\tau_1, \tau_2]$ such that $Z_t^i \ne 0$ a.s. for
$\tau_1 \le t \le \tau_2$.

On the other hand, suppose that $|Z^i| < \delta/2$ a.s. on a stochastic interval $[\tau_3, \tau_4]$.
Then on $[\tau_3, \tau_4]$, $\lambda^i(Z) = 0$ by (6.3), and hence $Z^i$ is almost surely absolutely
continuous with

$dZ^i/dt = b^i(X) - b^i(Y)$ a.e. on $[\tau_3, \tau_4]$.

This implies that $r^i = |Z^i|$ is almost surely absolutely continuous on $[\tau_3, \tau_4]$ as well
with

(6.9) $dr^i/dt = e^i \cdot (b^i(X) - b^i(Y))$ a.e. on $[\tau_3, \tau_4]$,

which is equivalent to (6.7) on $[\tau_3, \tau_4]$. Note that the value of $e^i$ for $Z^i = 0$ is
not relevant here, since $Z^i$ can only stay at $0$ for a positive amount of time if
$b^i(X) - b^i(Y)$ vanishes during that time interval.

Since $\mathbb{R}_+$ is the union of countably many stochastic intervals of the first and
second type considered above, the Itô equation (6.7) holds almost surely on $\mathbb{R}_+$.
The assertion (6.6) now follows from (6.7) by another application of Itô's formula.
Here it is enough to assume that $f$ is $C^1$ on $[0, \infty)$ and $f'$ is absolutely continuous
on $(0, \infty)$ because $\lambda^i(X_s - Y_s)$ vanishes for $r_s^i < \delta/2$. □

We now fix weights $w_1, \dots, w_n \in [0, \infty)$ and strictly increasing concave functions
$f_1, \dots, f_n \in C^1([0, \infty)) \cap C^2((0, \infty))$ such that $f_i(0) = 0$ for any $i$. Consider

(6.10) $\rho_t = \sum_{i=1}^{n} f_i(r_t^i)\, w_i = d_{f,w}(X_t, Y_t)$

where $d_{f,w}$ is defined by (3.3). By Lemma 6.2,

(6.11) $d\rho_t = \sum_{i=1}^{n} \big( e_t^i \cdot (b^i(X_t) - b^i(Y_t))\, f_i'(r_t^i) + 2 \lambda^i(X_t - Y_t)^2 f_i''(r_t^i) \big)\, w_i\,dt + 2 \sum_{i=1}^{n} \lambda^i(X_t - Y_t)\, f_i'(r_t^i)\, w_i\,dW_t^i.$

Notice that the last term on the right hand side is a martingale since $\lambda^i$ and $f_i'$
are bounded. This enables us to control the expectation $E[\rho_t]$ if we can bound the
drift in (6.11) by $m - c\rho_t$ for constants $m, c \in (0, \infty)$:

Lemma 6.4. Let $m, c \in (0, \infty)$ and suppose that

(6.12) $\sum_{i=1}^{n} \Big( c f_i(r^i) + (x^i - y^i) \cdot (b^i(x) - b^i(y))\, \frac{f_i'(r^i)}{r^i} + 2 \lambda^i(x - y)^2 f_i''(r^i) \Big)\, w_i \le m$

holds for any $x, y \in \mathbb{R}^d$ with $r^i := |x^i - y^i| > 0$ $\forall\, i \in \{1, \dots, n\}$. Then

(6.13) $E[\rho_t] \le e^{-ct} E[\rho_0] + m\, (1 - e^{-ct})/c$ for any $t \ge 0$.

Proof. We first note that by continuity of $b^i$ and $f_i'$, (6.12) implies that

(6.14) $\sum_{i=1}^{n} \big( c f_i(r^i) + e^i \cdot (b^i(x) - b^i(y))\, f_i'(r^i) + 2 \lambda^i(x - y)^2 f_i''(r^i) \big)\, w_i \le m$

holds for any $x, y \in \mathbb{R}^d$ (even if $x^i - y^i = 0$) provided $e^i = (x^i - y^i)/r^i$ if $r^i > 0$
and $e^i$ is an arbitrary unit vector if $r^i = 0$. Indeed, we obtain (6.14) by applying
(6.12) with $x^i$ replaced by $x^i + h e^i$ whenever $x^i - y^i = 0$ and taking the limit as
$h \downarrow 0$. In particular, by (6.14), the drift term $\beta_t$ in (6.11) is bounded from above
by

$\beta_t \le m - \sum_{i=1}^{n} c f_i(r_t^i)\, w_i = m - c\rho_t.$

Therefore by (6.11) and by the Itô product rule,

$d(e^{ct} \rho) = e^{ct}\,d\rho + c e^{ct} \rho\,dt \le e^{ct} m\,dt + dM$

where $M$ is a martingale, and thus

$E[e^{ct} \rho_t] \le E[\rho_0] + m \int_0^t e^{cs}\,ds$ for any $t \ge 0$. □



Since $f_i'' \le 0$, the process $\rho_t$ is decreasing more rapidly (or growing more slowly)
if $\lambda^i$ takes larger values. In particular, the decay properties of $\rho_t$ would be
optimized when $\lambda^i(z) = 1$ for any $z$ with $z^i \ne 0$. This optimal choice of $\lambda^1, \dots, \lambda^n$
would correspond to a componentwise reflection coupling, but it violates Condition
(6.3). It is perhaps possible to construct a corresponding coupling process by an
approximation argument. For our purpose of bounding the Kantorovich distance
$W_{f,w}(\mu p_t, \nu p_t)$ this is not necessary. Indeed, it will be sufficient to consider
approximate componentwise reflection couplings where (6.2) and (6.3) are satisfied
and $\lambda^i(z) = 1$ whenever $|z^i| \ge \delta$. The limit $\delta \downarrow 0$ will then be considered for the
resulting estimates of the Kantorovich distance but not for the coupling processes.

7. Application to interacting diffusions

We will now apply the couplings introduced in Section 6 to prove the contraction
properties for systems of interacting diffusions stated in Theorem 3.1 and Corollary
3.3. We consider the setup described in Section 3.1, i.e.,

(7.1) $b^i(x) = b_0^i(x^i) + \gamma^i(x)$ for $i = 1, \dots, n$

with $b_0^i : \mathbb{R}^{d_i} \to \mathbb{R}^{d_i}$ locally Lipschitz such that $\kappa_i$ defined by (3.4) is continuous
on $(0, \infty)$ with

(7.2) $\liminf_{r \to \infty} \kappa_i(r) > 0$ and $\lim_{r \to 0} r\, \kappa_i(r) = 0$ for any $1 \le i \le n$.

The functions $f_i$ are defined via $\kappa_i$, and $c_i$ is the corresponding contraction rate
given by (3.7).

Proof of Theorem 3.1. We fix $\delta > 0$ and Lipschitz continuous functions
$\lambda^i, \pi^i : \mathbb{R}^d \to [0, 1]$, $1 \le i \le n$, such that (6.2) and (6.3) hold and $\lambda^i(z) = 1$
if $|z^i| \ge \delta$. Let $(X_t, Y_t)$ denote a corresponding approximate componentwise
reflection coupling of two solutions of (3.1) given by (6.4), and let $\rho_t = d_{f,w}(X_t, Y_t)$.
We will apply Lemma 6.4 which requires bounding the right hand side in (6.12).
For this purpose recall that $f_i$ and $c_i$ have been chosen in such a way that

$2 f_i''(r) - \frac12 r\, \kappa_i(r) f_i'(r) \le -c_i f_i(r) \quad \forall\, r > 0,$

cf. (4.9) and (4.10). Therefore, by (7.1) and by definition of $\kappa_i$,

$(x^i - y^i) \cdot (b^i(x) - b^i(y))\, f_i'(r^i)/r^i + 2 \lambda^i(x - y)^2 f_i''(r^i)$

$\le -\frac12 r^i \kappa_i(r^i) f_i'(r^i) + |\gamma^i(x) - \gamma^i(y)|\, f_i'(r^i) + 2 \lambda^i(x - y)^2 f_i''(r^i)$

(7.3) $\le -\lambda^i(x - y)^2 c_i f_i(r^i) + |\gamma^i(x) - \gamma^i(y)| - \frac12 (1 - \lambda^i(x - y)^2)\, r^i \kappa_i(r^i) f_i'(r^i)$

$\le -c_i f_i(r^i) + |\gamma^i(x) - \gamma^i(y)| + c_i \delta + \frac12 \sup_{r < \delta} (r\, \kappa_i(r)^-)$

for any $x, y \in \mathbb{R}^d$ with $r^i = |x^i - y^i| > 0$. Here we have used that $0 \le f_i' \le 1$, and
that $\lambda^i(x - y) \ne 1$ only if $r^i < \delta$. In this case, $f_i(r^i) \le r^i < \delta$. By (7.3) and by
the assumption (3.8) on $\gamma^i$, we obtain

$\sum_{i=1}^{n} \big( (x^i - y^i) \cdot (b^i(x) - b^i(y))\, f_i'(r^i)/r^i + 2 \lambda^i(x - y)^2 f_i''(r^i) \big)\, w_i$

$\le m(\delta) + \sum_{i=1}^{n} (-c_i + \varepsilon_i) f_i(r^i)\, w_i \le m(\delta) - c \sum_{i=1}^{n} f_i(r^i)\, w_i$

for $x, y$ as above, where

$m(\delta) := \sum_{i=1}^{n} \Big( c_i \delta + \frac12 \sup_{r < \delta} (r\, \kappa_i(r)^-) \Big)\, w_i$

is a finite constant by (7.2), and $c = \min_{i=1,\dots,n} (c_i - \varepsilon_i)$. Hence (6.12) is satisfied
with $c$ and $m(\delta)$ and, therefore,

(7.4) $E[\rho_t] \le e^{-ct} E[\rho_0] + m(\delta)\, (1 - e^{-ct})/c.$

By choosing the coupling process $(X_t, Y_t)$ with initial distribution given by a
coupling $\eta$ of probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$ we conclude that

$W_{f,w}(\mu p_t, \nu p_t) \le E[d_{f,w}(X_t, Y_t)] = E[\rho_t]$

(7.5) $\le e^{-ct} \int d_{f,w}(x, y)\, \eta(dx\,dy) + m(\delta)\, (1 - e^{-ct})/c$

for any $t \ge 0$. Moreover, by (3.6), $m(\delta) \to 0$ as $\delta \downarrow 0$. Hence the assertion (3.9)
follows from (7.5) by taking the limit as $\delta \downarrow 0$ and minimizing over all couplings $\eta$
of $\mu$ and $\nu$. Finally, (3.10) follows from (3.9) since $\varphi_i(R_0^i)\, r/2 \le f_i(r) \le r$ implies

$A^{-1} d_{\ell^1}(x, y) \le d_{f,w}(x, y) = \sum_{i=1}^{n} f_i(|x^i - y^i|)\, w_i \le d_{\ell^1}(x, y).$ □

Proof of Corollary 3.3. The $\ell^1$-Lipschitz condition (3.11) for $\gamma$ implies that
(3.8) holds with $w_i = 1$ for any $i$, and

$\lambda\, \varepsilon_i^{-1} = \inf_{r > 0} f_i(r)/r = \varphi_i(R_0^i)/2,$

i.e., $\varepsilon_i = 2\lambda/\varphi_i(R_0^i)$. The assertion now follows from Theorem 3.1. □